GENERAL REGULARIZATION SCHEMES FOR SIGNAL 
DETECTION IN INVERSE PROBLEMS 
Abstract. The authors discuss how general regularization schemes, in particular
linear regularization schemes and projection schemes, can be used to design tests
for signal detection in statistical inverse problems. It is shown that such tests
can attain the minimax separation rates when the regularization parameter is
chosen appropriately. It is also shown how to modify these tests in order to
obtain (up to a log log factor) a test which adapts to the unknown smoothness in
the alternative. Moreover, the authors discuss how the so-called direct and
indirect tests are related via interpolation properties.



1. Introduction and motivation 

Statistical inverse problems have been intensively studied over the last years.
Most of this work is devoted to the estimation of indirectly observed signals. In
contrast, only a few studies are concerned with signal detection, which is a
problem of statistical testing. This is the core of the present paper. Precisely,
we consider a statistical problem in Hilbert space, where we are given two Hilbert
spaces $H$ and $K$ along with a (compact) linear operator $T\colon H \to K$. Given the
(unknown) element $f \in H$ we observe

(1.1)    $Y = Tf + \sigma\xi,$

where $\xi$ is a Gaussian white noise, and $\sigma$ is a positive noise level. A large amount
of attention has been paid to the estimation issue, where one wants to estimate
the function $f$ of interest, and control the associated error. We refer for instance to
[5] for a review of existing methods in a deterministic setting ($\xi$ is a deterministic
error satisfying $\|\xi\| \le 1$). In the statistical framework, the noise $\xi$ is not assumed
to be bounded. In this case, there is a slight abuse of notation in using (1.1). We
assume in fact that for all $g \in K$, we can observe

         $\langle Y, g\rangle = \langle Tf, g\rangle + \sigma\langle \xi, g\rangle,$

where $\langle\cdot,\cdot\rangle$ denotes the scalar product in $K$. Details will be given in Section 2. In
this context, we mention [5] or [^ among others for a review of existing method- 
ologies and related rates of convergence for estimation under Gaussian white noise. 
In this study, our aim is to test the null hypothesis that the (underlying true)
signal $f$ corresponds to a given signal $f_0$ against a non-parametric alternative. More
formally, we test

(1.2)    $H_0\colon f = f_0, \quad\text{against}\quad H_{1,\rho}\colon f - f_0 \in \mathcal{E},\ \|f - f_0\| \ge \rho,$

where $\mathcal{E}$ is a subset of $H$, and $\rho > 0$ is a given radius. The subset $\mathcal{E}$ can be understood
as a smoothness constraint on the remainder $f - f_0$, while the quantity $\rho$ measures
the amount of signal, different from $f_0$, available in the observation. Depending on the
setting, (1.2) is known as a goodness-of-fit or a signal detection (when $f_0 = 0$)
testing problem. In the direct case, i.e. when $T = \mathrm{Id}$, this problem has been
widely investigated. We mention for instance seminal investigations proposed in 
[m [13l [M] . We refer also to [1] where a non-asymptotic approach is proposed. 

Concerning testing in inverse problems there exist, to our knowledge, only
a few references, e.g. [T3] and [TH]. In these contributions, a preliminary
estimator $\hat f$ for the underlying signal $f$ is used. This estimator is based on a spectral
cut-off scheme in [TH], or on a refined version using Pinsker's filter in [TS]. All
these approaches are based on the same truncated singular value decomposition
(see Section 3.1 for more details). Here we shall consider general linear estimators
$\hat f = RY$, using the data $Y$. Plainly, since $f_0$ and hence $Tf_0$ are given, we can
restrict the analysis to testing whether $f = 0$ (no signal) against the alternative
$H_{1,\rho}\colon f \in \mathcal{E},\ \|f\| \ge \rho$, and we discuss this simplified model from now on.

In the following, we will deal with level-$\alpha$ tests, i.e. measurable functions of the
data with values in $\{0, 1\}$. By convention, we reject $H_0$ if the test is equal to 1 and
do not reject this hypothesis otherwise. We are interested in the optimal value of
$\rho$ (see (1.2)) for which a prescribed level for the second kind error can be attained.
More formally, given a fixed value of $\beta \in\, ]0, 1[$ and a level-$\alpha$ test $\Phi_\alpha$, we are interested
in the radius $\rho(\Phi_\alpha, \beta, \mathcal{E})$ defined as

         $\rho(\Phi_\alpha, \beta, \mathcal{E}) = \inf\Bigl\{ \rho \in \mathbb{R}_+ :\ \sup_{f \in \mathcal{E},\, \|f\| \ge \rho} P_f(\Phi_\alpha = 0) \le \beta \Bigr\}.$

From this, the minimax separation radius $\rho(\alpha, \beta, \mathcal{E})$ can be defined as the smallest
radius over all possible testing procedures, i.e.

         $\rho(\alpha, \beta, \mathcal{E}) = \inf_{\Phi_\alpha} \rho(\Phi_\alpha, \beta, \mathcal{E}),$

where the infimum is taken over all level-$\alpha$ tests $\Phi_\alpha$. We stress that this minimax
separation radius will depend on the noise level $\sigma$, and on spectral properties, both of
the operator $T$ which governs the equation (1.1), and of the class $\mathcal{E}$, describing the
smoothness of the alternative.

Lower (and upper) bounds have already been established in order to characterize 
the behavior of this radius for different kinds of smoothness assumptions (see for
instance [TS] or [T^). Recent analysis of (classical) inverse problems adopts a 
different approach by measuring the smoothness inherent in the class $\mathcal{E}$ relative
to the operator $T$. By doing so, a unified treatment of moderately, severely and
mildly ill-posed problems is possible. We adopt this paradigm here and consider the
classes $\mathcal{E}$ as source sets, see details in § 3.

Also, previous analysis was restricted to the truncated singular value decompo- 
sition of the underlying operator T. This limits the applicability of the test proce- 
dures, since often a singular value decomposition is hardly available, for instance 
when considering partial differential equations on domains with noisy boundary 
data. Therefore, the objective in this study is to propose alternative testing proce- 
dures that match the previous minimax bounds. 

To this end we first consider general linear regularization in terms of an operator
$R$ (Sections 2.2 & 2.3), and we shall then specify these as linear regularization
(in Section 3.1) or projection schemes (in Section 3.2), respectively. In each case,
we derive the corresponding minimax separation radii. Next the relation between
testing based on the estimation of $f$ (inverse test), and testing based on the
estimation of $Tf$ (direct test) is discussed in Section 4. Such a discussion can already be
found in [17]. However, here we highlight that the relation between both problems
can be seen as a result of interpolation between smoothness spaces, the one which
describes the signal $f$ and the one which characterizes the smoothness of $Tf$.

Finally, we shall establish in Section 5 an adaptive test, which is based on a
finite family of non-adaptive tests. It will be shown that this adaptive test, with
an appropriately constructed finite family, is (up to a log log factor) as good as the
best among the whole family of tests.

2. Construction and calibration of the test 



Considering the testing problem (1.2), most of the related tests are based on an
estimation of $\|f\|^2$ ($\|f - f_0\|^2$ in the general case). Then, the idea is to reject $H_0$ as
soon as this estimate becomes too large with respect to a prescribed threshold.
As outlined above, in order to estimate $\|f\|^2$, where $f \in H$, from the observations
$Y$, cf. (1.1), we shall use a general linear reconstruction operator $R\colon K \to H$.

2.1. Notation and assumptions. First we will specify the assumption on the
noise $\xi$ in (1.1).

Assumption A1 (Gaussian white noise). The noise $\xi$ is a weak random element
in $K$, which has absolute weak second moments. Specifically, for all $g, g_1, g_2 \in K$,
we have

         $\langle \xi, g\rangle \sim \mathcal{N}(0, \|g\|^2), \qquad (\text{and } E[\langle \xi, g_1\rangle\langle \xi, g_2\rangle] = \langle g_1, g_2\rangle).$

Notice that the second property is a consequence of the first, because (symmetric)
bilinear forms in Hilbert space are determined by their values on the diagonal. Under
this assumption, given any linear reconstruction operator $R\colon K \to H$ the element
$RY$ belongs to $H$ almost surely, provided that $R$ is a Hilbert-Schmidt operator
(Sazonov's Theorem). When specifying the reconstruction $R$ in Sections 3.1 & 3.2
we shall always make sure that this is the case. Then the application of $R$ to the
data $Y$ may be decomposed as

(2.1)    $RY = RTf + \sigma R\xi = f_R + \sigma R\xi, \qquad f \in H,$

where $f_R := RTf$ denotes the noiseless (deterministic) part of $RY$. Along with
the reconstruction $RY$ the following quantities will prove important. First, we can
compute the bias variance decomposition

(2.2)    $E\|RY\|^2 = \|RTf\|^2 + \sigma^2 E\|R\xi\|^2 = \|f_R\|^2 + S_R^2,$

where we introduce the variance of the estimator as

(2.3)    $S_R^2 := \sigma^2 E\|R\xi\|^2 = \sigma^2\, \mathrm{tr}[R^*R],$

which is finite if $R$ is a Hilbert-Schmidt operator. In addition the following weak
variance will play a role.

(2.4)    $v_R^2 := \sigma^2 \sup_{\|w\| \le 1} E|\langle R\xi, w\rangle|^2 = \sigma^2 \|R\|^2.$

Below, if $R$ is clear from the context we sometimes abbreviate $S = S_R$ and $v = v_R$.
We will need a more precise representation of the trace and norm above in terms
of the representation of the operator $R$. Suppose that $R$ is given in terms of
its singular value decomposition as

(2.5)    $Rg = \sum_{j=1}^{\infty} \lambda_j \langle \psi_j, g\rangle \phi_j, \qquad g \in K,$

where we assume that both sequences $\{\psi_j\}_{j\in\mathbb{N}}$ and $\{\phi_j\}_{j\in\mathbb{N}}$ are orthonormal bases
in $K$ and $H$, respectively. Moreover, the sequence $\lambda_j$, $j = 1, 2, \dots$ is assumed
non-negative and arranged in non-increasing order. Then the following is well-known.



CLEMENT MARTEAU AND PETER MATHE 



Lemma 2.1. Let $R$ be as in (2.5). Then

(1) $\mathrm{tr}[R^*R] = \sum_{j=1}^{\infty} \lambda_j^2$, and

(2) $\|R\|^2 = \sup_{j \ge 1} \lambda_j^2$.

From this we can see that $v^2 \le S^2$, and typically these quantities differ by order.
Some explicit computations will be provided below.
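
As a small numerical illustration of Lemma 2.1, the following Python sketch computes $S_R^2$ and $v_R^2$ from a finite list of singular values of $R$. The function name `variance_terms`, the truncation to finitely many $\lambda_j$, and the sample sequence $\lambda_j = 1/j$ are assumptions made for illustration only; none of them appears in the paper.

```python
def variance_terms(lams, sigma):
    """Return (S^2, v^2) for a reconstruction R with singular values `lams`.

    S^2 = sigma^2 * tr[R*R] = sigma^2 * sum_j lam_j^2   (Lemma 2.1 (1))
    v^2 = sigma^2 * ||R||^2 = sigma^2 * max_j lam_j^2   (Lemma 2.1 (2))
    """
    squares = [lam * lam for lam in lams]
    return sigma**2 * sum(squares), sigma**2 * max(squares)


# Hypothetical example: lam_j = 1/j, truncated at j = 1000, noise level 0.1.
lams = [1.0 / j for j in range(1, 1001)]
S2, v2 = variance_terms(lams, sigma=0.1)
```

For this sequence the trace sums over all indices while the norm only sees the largest singular value, which is why $v^2 \le S^2$ always holds.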



2.2. Construction of the test and control of the first kind error. We see
from (2.2) that the quantity $\|RY\|^2 - S_R^2$ is an unbiased estimator for
$\|f_R\|^2$. If $R$ is chosen appropriately, this term is an approximation of $\|f\|^2$, whose
value is of first importance when considering the problem (1.2). Therefore, we shall
use a threshold for $\|RY\|^2 - S_R^2$ to describe the test.

Let $\alpha \in (0, 1)$ be the prescribed level for the first kind error, and we agree to
abbreviate $x_\alpha := \log(1/\alpha)$. We define the test $\Phi_{\alpha,R}$ as

(2.6)    $\Phi_{\alpha,R} = \mathbf{1}_{\{\|RY\|^2 - S_R^2 > t_{R,\alpha}\}},$

where $t_{R,\alpha}$ denotes the $1 - \alpha$ quantile of the variable $\|RY\|^2 - S_R^2$ under $H_0$. Due
to the definition of the threshold $t_{R,\alpha}$, the test $\Phi_{\alpha,R}$ is a level-$\alpha$ test. Indeed

         $P_{H_0}(\Phi_{\alpha,R} = 1) = P_{H_0}(\|RY\|^2 - S_R^2 > t_{R,\alpha}) = \alpha.$

We emphasize that under $H_0$ the distribution of $\|RY\|^2 - S_R^2 = \sigma^2(\|R\xi\|^2 - \mathrm{tr}[R^*R])$
only depends on the chosen reconstruction $R$. Hence the quantile can be
determined, at least approximately. Proposition 2.1 below establishes an upper bound
for this quantile.

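Proposition 2.1. Let $\alpha$ be a fixed level. Then

         $t_{R,\alpha} \le 2\sqrt{2x_\alpha}\, S_R v_R + 2 v_R^2 x_\alpha,$

where the quantities $S_R$ and $v_R$ have been introduced in (2.3) and (2.4).

Proof. First notice that under $H_0$, $\|RY\|^2 = \|\sigma R\xi\|^2$. Then we get

         $P_{H_0}\bigl(\|RY\|^2 - S_R^2 > 2\sqrt{2x_\alpha}\, S_R v_R + 2 v_R^2 x_\alpha\bigr)$
         $\quad = P_{H_0}\bigl(\|\sigma R\xi\|^2 - S_R^2 > 2\sqrt{2x_\alpha}\, S_R v_R + 2 v_R^2 x_\alpha\bigr)$
         $\quad = P_{H_0}\bigl(\|\sigma R\xi\|^2 - E\|\sigma R\xi\|^2 > 2\sqrt{2x_\alpha}\, S_R v_R + 2 v_R^2 x_\alpha\bigr)$
         $\quad \le \alpha,$

where we have used Lemma A.1 with $x = \sqrt{2x_\alpha}\, v_R$, in order to get the last
inequality. Hence,

         $P_{H_0}\bigl(\|RY\|^2 - S_R^2 > 2\sqrt{2x_\alpha}\, S_R v_R + 2 v_R^2 x_\alpha\bigr) \le \alpha,$

which leads to the desired result. □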
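
The bound of Proposition 2.1 can be checked by simulation. The sketch below, in plain Python, evaluates the threshold bound and estimates how often $\|RY\|^2 - S_R^2$ exceeds it under $H_0$, using the diagonal representation (2.5). The helper names, the choice $\lambda_j = 1/j$ with 50 terms, and the number of replications are illustrative assumptions, not part of the paper.

```python
import math
import random


def threshold_bound(S, v, alpha):
    """Upper bound of Proposition 2.1 on the (1 - alpha)-quantile t_{R,alpha}."""
    x_alpha = math.log(1.0 / alpha)
    return 2.0 * math.sqrt(2.0 * x_alpha) * S * v + 2.0 * v * v * x_alpha


def rejection_rate_under_h0(lams, sigma, alpha, n_rep=2000, seed=0):
    """Empirical frequency of ||RY||^2 - S^2 exceeding the bound when f = 0."""
    rng = random.Random(seed)
    S2 = sigma**2 * sum(lam * lam for lam in lams)
    v = sigma * max(lams)  # lams are assumed non-negative
    bound = threshold_bound(math.sqrt(S2), v, alpha)
    hits = 0
    for _ in range(n_rep):
        # Under H0, RY = sigma * R xi, so ||RY||^2 = sum_j (sigma * lam_j * eps_j)^2.
        stat = sum((sigma * lam * rng.gauss(0.0, 1.0)) ** 2 for lam in lams) - S2
        hits += stat > bound
    return hits / n_rep


# Hypothetical example: lam_j = 1/j for j = 1..50, sigma = 1, level alpha = 0.05.
lams = [1.0 / j for j in range(1, 51)]
rate = rejection_rate_under_h0(lams, sigma=1.0, alpha=0.05)
```

Since the bound is an upper bound for the exact quantile, the empirical rate is expected to stay below $\alpha$; in practice the test may therefore be somewhat conservative if the bound is used in place of $t_{R,\alpha}$.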

2.3. Controlling the second kind error. Here, our aim is to control the second
kind error by some prescribed level $\beta > 0$, and again we abbreviate $x_\beta := \log(1/\beta)$.
To this end, we have to exhibit conditions on $f$ for which the probability $P_f(\Phi_{\alpha,R} = 0)$
will be bounded by $\beta$. By construction of the above test this amounts to bounding

         $P_f(\Phi_{\alpha,R} = 0) = P_f\bigl(\|RY\|^2 - S_R^2 \le t_{R,\alpha}\bigr)$
         $\quad = P_f\bigl(\|RY\|^2 - E\|RY\|^2 \le t_{R,\alpha} + S_R^2 - E\|RY\|^2\bigr)$
(2.7)    $\quad = P_f\bigl(\|RY\|^2 - E\|RY\|^2 \le t_{R,\alpha} - \|f_R\|^2\bigr),$

where the latter follows from (2.2). In this section, we will investigate the lowest
possible value of $\|f_R\|^2$ for which the previous probability can be bounded by $\beta$.
Let $\beta \in\, ]0, 1[$ be fixed. For all $f \in H$, we denote by $t_{R,\beta}(f)$ the $\beta$-quantile of the
centered variable $\|RY\|^2 - E\|RY\|^2$. In other words

(2.8)    $P_f\bigl(\|RY\|^2 - E\|RY\|^2 \le t_{R,\beta}(f)\bigr) = \beta.$

Then, we get from (2.7) and (2.8) that $P_f(\Phi_{\alpha,R} = 0)$ will be bounded by $\beta$ as soon
as

(2.9)    $t_{R,\alpha} - \|f_R\|^2 \le t_{R,\beta}(f) \iff \|f_R\|^2 \ge t_{R,\alpha} - t_{R,\beta}(f).$

We already have an upper bound on the $1 - \alpha$ quantile $t_{R,\alpha}$. In order to conclude
this discussion, we need a lower bound on $t_{R,\beta}(f)$.

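Lemma 2.2. Let the reconstruction $R$ be given as in (2.5), and let

(2.10)   $\Sigma := \sum_{j=1}^{\infty} \sigma_j^4 + 2\sum_{j=1}^{\infty} \sigma_j^2\theta_j^2,$

where $\theta_j := \lambda_j\langle \psi_j, Tf\rangle$ and $\sigma_j := \sigma\lambda_j$, $j \in \mathbb{N}$. Then

         $t_{R,\beta}(f) \ge -2\sqrt{\Sigma x_\beta}.$

Proof. We first show the relation of the problem to a specific sequence space model.
By construction of $R$, using (2.5), we can expand

         $RY = \sum_{j=1}^{\infty} \lambda_j\langle \psi_j, Y\rangle\phi_j = \sum_{j=1}^{\infty} \lambda_j\langle \psi_j, Tf\rangle\phi_j + \sigma\sum_{j=1}^{\infty} \lambda_j\langle \psi_j, \xi\rangle\phi_j$
         $\quad = \sum_{j=1}^{\infty} \theta_j\phi_j + \sum_{j=1}^{\infty} \sigma_j\varepsilon_j\phi_j,$

where, as above, $\theta_j := \lambda_j\langle \psi_j, Tf\rangle$ and $\sigma_j := \sigma\lambda_j$ for all $j \in \mathbb{N}$, and the
$\varepsilon_j := \langle \psi_j, \xi\rangle$ are i.i.d. standard Gaussian random variables. Then, we can apply
Lemma A.2 which gives

         $P\bigl(\|RY\|^2 - E\|RY\|^2 \le -2\sqrt{\Sigma x_\beta}\bigr) \le \beta,$

which completes the proof. □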

We are now able to find a condition on $\|f_R\|^2$ in order to control the second kind
error. We introduce the following quantity

(2.11)   $C^*_{\alpha,\beta} = 4\sqrt{x_\beta} + 4\sqrt{2x_\alpha},$

which is a function of $\alpha$ and $\beta$ only.

Proposition 2.2. Let us consider the test $\Phi_{\alpha,R}$ as introduced in (2.6), and let

(2.12)   $r^2(\Phi_{\alpha,R}, \beta) := C^*_{\alpha,\beta}\, S v + (4x_\alpha + 8x_\beta)\, v^2.$

Then

         $\sup_{f,\ \|f_R\|^2 \ge r^2(\Phi_{\alpha,R},\beta)} P_f(\Phi_{\alpha,R} = 0) \le \beta.$

Proof. The equation (2.9) provides a condition for which $P_f(\Phi_{\alpha,R} = 0) \le \beta$. Using
Proposition 2.1 and Lemma 2.2 we see that this condition is satisfied as soon as

         $\|f_R\|^2 \ge 2\sqrt{\Sigma x_\beta} + 2\sqrt{2x_\alpha}\, S v + 2 v^2 x_\alpha.$

Now we bound

         $\Sigma = \sum_{j=1}^{+\infty} \sigma_j^4 + 2\sum_{j=1}^{+\infty} \sigma_j^2\theta_j^2 \le S^2 v^2 + 2 v^2 \|f_R\|^2.$

Using the inequality $ab \le a^2/2 + b^2/2$ for all $a, b \in \mathbb{R}$, we get

         $2\sqrt{\Sigma x_\beta} \le 2 S v\sqrt{x_\beta} + 2\sqrt{2x_\beta}\, \|f_R\|\, v$
         $\quad \le 2 S v\sqrt{x_\beta} + \tfrac{1}{2}\|f_R\|^2 + 4 x_\beta v^2.$

In particular, the condition (2.9) will be satisfied as soon as

         $\tfrac{1}{2}\|f_R\|^2 \ge \bigl(2\sqrt{x_\beta} + 2\sqrt{2x_\alpha}\bigr) S v + v^2 (2x_\alpha + 4x_\beta).$

□



Remark 2.1. Please note that the condition on $\|f_R\|^2$ is (as most of the results
presented below) non-asymptotic, i.e. we do not require that $\sigma \to 0$. Using the
property $v \le S$, we can obtain the simple bound

(2.13)   $r^2(\Phi_{\alpha,R}, \beta) \le C_{\alpha,\beta}\, S v, \quad\text{where}\quad C_{\alpha,\beta} = 4\sqrt{x_\beta} + 4\sqrt{2x_\alpha} + 4x_\alpha + 8x_\beta.$

In an asymptotic setting, the value of the constant $C_{\alpha,\beta}$ may sometimes be
improved. In particular, the majorization $v \le S$ is rather rough. In many cases, we
will only deal with the constant $C^*_{\alpha,\beta}$, and we refer to Corollary 3.1.
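
To make the constants of (2.11) to (2.13) concrete, the following Python sketch evaluates $C^*_{\alpha,\beta}$, $C_{\alpha,\beta}$ and $r^2(\Phi_{\alpha,R},\beta)$ for given $S$, $v$, $\alpha$, $\beta$. The function names and the sample values are hypothetical and serve only as an illustration of the formulas.

```python
import math


def separation_constants(alpha, beta):
    """Return (C*_{alpha,beta}, C_{alpha,beta}) as in (2.11) and (2.13)."""
    x_a = math.log(1.0 / alpha)
    x_b = math.log(1.0 / beta)
    c_star = 4.0 * math.sqrt(x_b) + 4.0 * math.sqrt(2.0 * x_a)
    c_full = c_star + 4.0 * x_a + 8.0 * x_b
    return c_star, c_full


def r_squared(S, v, alpha, beta):
    """Squared radius r^2(Phi_{alpha,R}, beta) from (2.12)."""
    x_a = math.log(1.0 / alpha)
    x_b = math.log(1.0 / beta)
    c_star, _ = separation_constants(alpha, beta)
    return c_star * S * v + (4.0 * x_a + 8.0 * x_b) * v * v


# Hypothetical values: S = 2, v = 1, alpha = beta = 0.05.
c_star, c_full = separation_constants(0.05, 0.05)
r2 = r_squared(2.0, 1.0, 0.05, 0.05)
```

Whenever $v \le S$ the value `r2` is at most `c_full * S * v`, which is the simplification used in (2.13).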



3. Determining the separation radius under smoothness 

We have seen in the previous section that we need to have $\|f_R\|^2 \ge C_{\alpha,\beta}\, S v$
in order to control the second kind error. Nevertheless, the alternative in (1.2) is
expressed in terms of a lower bound on $\|f\|^2$. In this section, we take advantage of
the smoothness of $f$ in order to propose an upper bound on the separation radius.

Using the triangle inequality, we obtain

         $\|f_R\| \ge \|f\| - \|f - f_R\|.$

Hence, $\|f_R\|^2 \ge r^2(\Phi_{\alpha,R}, \beta)$ as soon as

         $\|f\| - \|f - f_R\| \ge r(\Phi_{\alpha,R}, \beta),$
         $\iff \|f\|^2 \ge \bigl(r(\Phi_{\alpha,R}, \beta) + \|f - f_R\|\bigr)^2,$
         $\Longleftarrow \|f\|^2 \ge 2 r^2(\Phi_{\alpha,R}, \beta) + 2\|f - f_R\|^2.$

In other words, we get from Proposition 2.2 that

(3.1)    $\sup_{f,\ \|f\|^2 \ge 2 r^2(\Phi_{\alpha,R},\beta) + 2\|f - f_R\|^2} P_f(\Phi_{\alpha,R} = 0) \le \beta.$

Hence we need to make the lower bound on $\|f\|$ as small as possible. We aim at
finding sharp upper bounds for

(3.2)    $\inf_{R \in \mathcal{R}} \bigl( r^2(\Phi_{\alpha,R}, \beta) + \|f - f_R\|^2 \bigr),$

where the reconstructions $R$ belong to certain families $\mathcal{R}$. We shall establish
order optimal bounds in two cases, the case of linear regularization and the case of
projection schemes.

As already mentioned, we shall measure the smoothness relative to the operator
$T$, and this is done as follows. Since the operator $T$ is compact so is the
self-adjoint companion $T^*T$. The range of $T^*T$ is a (dense) subset in $H$, and one
may consider an element $f$ smooth if it is in the range of $T^*T$. To be more flexible,
we shall do this for more general (operator) functions $\varphi(T^*T)$. The corresponding
operator $\varphi(T^*T)$ is compact whenever $\varphi(t) \to 0$ as $t \to 0$. Therefore, we shall
restrict to functions with this property. Precisely, we let

(3.3)    $\mathcal{E}_\varphi = \{ h \in H,\ h = \varphi(T^*T)\omega, \text{ for some } \|\omega\| \le 1 \},$

for a continuous non-decreasing function $\varphi$ which obeys $\varphi(0) = 0$ (index function),
be a general source set. It was established in [20] that each element in $H$ has some
smoothness of this kind, and hence the present approach is most general. Examples
which relate Sobolev type balls to the present setup are given in Examples 3 & 4.

3.1. Linear regularization. We recall the notion of linear regularization, see 
e.g. [Ill Definition 2.2]. Such approaches are rather popular for estimation purpose. 
In this section, we describe how these can be tuned in order to obtain suitable tests. 

Definition 1 (linear regularization). A family of functions

         $g_\tau\colon (0, \|T^*T\|] \to \mathbb{R}, \qquad 0 < \tau \le \|T^*T\|,$

is called regularization if they are piece-wise continuous in $\tau$ and the following
properties hold:

(1) For each $0 < t \le \|T^*T\|$ we have that $|r_\tau(t)| \to 0$ as $\tau \to 0$;

(2) There is a constant $\gamma_1$ such that $\sup_{0 \le t \le \|T^*T\|} |r_\tau(t)| \le \gamma_1$ for all
$0 < \tau \le \|T^*T\|$;

(3) There is a constant $\gamma_* \ge 1$ such that $\sup_{0 < t \le \|T^*T\|} \tau |g_\tau(t)| \le \gamma_*$ for all
$0 < \tau < \infty$,

where $r_\tau(t) := 1 - t g_\tau(t)$, $0 \le t \le \|T^*T\|$, denotes the residual function.

Notice that, in contrast to the usual convention, we use the symbol $\tau$ instead of
$\alpha$, as the latter is used as control parameter for the error of the first kind.

Having chosen a specific regularization scheme $g_\tau$ we assign as reconstruction
the linear mapping $R_\tau := g_\tau(T^*T)T^*\colon K \to H$. Notice that now the element $f_R$
is obtained as $f_R = f_\tau = g_\tau(T^*T)T^*T f$.

Example 1 (truncated svd, spectral cut-off). Let $(s_j, u_j, v_j)_{j\in\mathbb{N}}$ be the singular value
decomposition of the operator $T$, i.e., we have that

         $Tf = \sum_{j=1}^{\infty} s_j \langle f, u_j\rangle v_j, \qquad f \in H,$

and the singular numbers $s_1 \ge s_2 \ge \dots > 0$ are arranged in decreasing order. With
this notation we can use the function $g_\tau(t) := 1/t$, $t \ge \tau$, and zero else. This means
that we approximate the inverse mapping of $T$ by the finite expansion
$R_\tau Y := \sum_{s_j^2 \ge \tau} \frac{1}{s_j}\langle Y, v_j\rangle u_j$, $Y \in K$, and the condition $s_j^2 \ge \tau$ translates to an upper bound
$1 \le j \le D = D(\tau)$. The element $f_\tau$ is then given as $f_\tau = \sum_{j=1}^{D} \langle f, u_j\rangle u_j$.

Example 2 (Tikhonov regularization). Another common linear regularization scheme
is given with $g_\tau(t) = 1/(t + \tau)$, $t, \tau > 0$. In this case we have that
$R_\tau Y = (\tau I + T^*T)^{-1} T^*Y$, i.e., this is the minimizer of the penalized least squares
functional $J_\tau(f) := \|Y - Tf\|^2 + \tau\|f\|^2$, $f \in H$.
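
The two filters of Examples 1 and 2 can be written down directly. The Python sketch below implements $g_\tau$ and the residual $r_\tau(t) = 1 - t g_\tau(t)$ as scalar functions; the function names are hypothetical, and the grid used in the usage lines is an arbitrary choice for illustration.

```python
def g_cutoff(t, tau):
    """Spectral cut-off filter: 1/t for t >= tau, zero otherwise (Example 1)."""
    return 1.0 / t if t >= tau else 0.0


def g_tikhonov(t, tau):
    """Tikhonov filter 1/(t + tau) (Example 2)."""
    return 1.0 / (t + tau)


def residual(g, t, tau):
    """Residual function r_tau(t) = 1 - t * g_tau(t) of Definition 1."""
    return 1.0 - t * g(t, tau)


# Evaluate both filters on a grid of spectral values t in (0, 1].
grid = [k / 100.0 for k in range(1, 101)]
tau = 0.1
max_tau_g = max(tau * abs(g(t, tau)) for g in (g_cutoff, g_tikhonov) for t in grid)
max_resid = max(abs(residual(g, t, tau)) for g in (g_cutoff, g_tikhonov) for t in grid)
```

On this grid both $\tau |g_\tau(t)|$ and $|r_\tau(t)|$ stay bounded by one, in line with properties (2) and (3) of Definition 1 for these two schemes.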

Having chosen any linear regularization, we would like to bound the quantities
$S_R^2 = S_\tau^2$ and $v_R^2 = v_\tau^2$ from (2.3), (2.4) (with a slight abuse of notation). To this end,
we will impose the following assumption.

Assumption A2. The operator $T$ is a Hilbert-Schmidt operator, i.e.,

         $\mathrm{tr}[T^*T] < +\infty.$

Under the above assumption, the reconstructions $R_\tau$ are also Hilbert-Schmidt
operators, since these are compositions involving $T^*$.

In the following, we shall use the effective dimension, which allows us to construct
a bound on the variance $S^2$.

Definition 2 (effective dimension, see [HIISS]). The function $\lambda \mapsto \mathcal{N}(\lambda)$ defined as

(3.4)    $\mathcal{N}(\lambda) := \mathrm{tr}\bigl[(T^*T + \lambda I)^{-1} T^*T\bigr]$

is called effective dimension of the operator $T^*T$ under white noise.
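
In terms of the singular numbers $s_j$ of $T$, the trace in (3.4) becomes $\sum_j s_j^2/(s_j^2 + \lambda)$. The following Python sketch evaluates this sum for a truncated sequence; the function name, the truncation level and the choice $s_k = 1/k$ are assumptions made purely for illustration.

```python
def effective_dimension(sing_vals, lam):
    """N(lam) = tr[(T*T + lam I)^{-1} T*T] = sum_j s_j^2 / (s_j^2 + lam)."""
    return sum(s * s / (s * s + lam) for s in sing_vals)


# Hypothetical example: s_k = 1/k, truncated after 20000 terms.
sing_vals = [1.0 / k for k in range(1, 20001)]
n_coarse = effective_dimension(sing_vals, 1e-2)
n_fine = effective_dimension(sing_vals, 1e-4)
```

The effective dimension increases as $\lambda$ decreases, and it roughly counts the singular numbers with $s_j^2 \ge \lambda$.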
By Assumption A2 the operator $T^*T$ has a finite trace, and the operator
$(T^*T + \lambda I)^{-1}$ is bounded, thus the function $\mathcal{N}$ is finite. The following bound is a
consequence of [H Lem. 3.1].

(3.5)    $\mathrm{tr}\bigl[g_\tau^2(T^*T)T^*T\bigr] \le 2\gamma_*^2\, \frac{\mathcal{N}(\tau)}{\tau},$

for some constant $\gamma_* > 0$. This, together with the definition of regularization schemes,
results in the following bounds.

Lemma 3.1. Let $R_\tau := g_\tau(T^*T)T^*\colon K \to H$. Assume that Assumption A2 holds,
then we have that

(i) $S_\tau^2 \le 2\gamma_*^2 \sigma^2\, \frac{\mathcal{N}(\tau)}{\tau}$, $\tau > 0$, and

(ii) $v_\tau^2 \le \gamma_*^2\, \frac{\sigma^2}{\tau}$, $\tau > 0$.

Proof. The proof is a direct consequence of the definition of $S^2$, $v^2$ and of (3.5). □

The previous lemma only provides upper bounds for the terms $S_\tau$ and $v_\tau$. For
many linear regularization schemes we can actually show that $v_\tau/S_\tau \to 0$ as $\tau \to 0$,
and we mention the following result.

Lemma 3.2. Suppose that the regularization $g_\tau$ has the following properties.

(1) There are constants $c, \gamma > 0$ such that $|g_\tau(c\tau)| \ge \gamma/\tau$ for $\tau > 0$, and

(2) for each $0 < t \le \|T^*T\|$ the function $\tau \mapsto |g_\tau(t)|$ is decreasing.

If the singular numbers of the operator $T$ decay moderately, such that
$\#\{ j :\ c\tau \le s_j^2 \le \tau/c \} \to \infty$ as $\tau \to 0$, then $\mathrm{tr}\bigl[\tau g_\tau^2(T^*T)T^*T\bigr] \to \infty$ as $\tau \to 0$.
Consequently, in this case we have that $v_\tau/S_\tau \to 0$ as $\tau \to 0$.

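Proof. For the first assertion we bound, given $\tau > 0$, and using the singular
numbers $s_j$ of the operator $T$, the trace as follows. We abbreviate, for $s_j^2 \ge c\tau$, the
value $\beta_j := s_j^2/c$. Then for $0 < c < 1$ we find that

         $\mathrm{tr}\bigl[\tau g_\tau^2(T^*T)T^*T\bigr] = \sum_{j=1}^{\infty} \tau g_\tau^2(s_j^2) s_j^2 \ge \sum_{c\tau \le s_j^2 \le \tau/c} \tau g_\tau^2(s_j^2) s_j^2.$

For each index in the last sum we have $\beta_j \ge \tau$, so that property (2) and then
property (1) yield $|g_\tau(s_j^2)| \ge |g_{\beta_j}(c\beta_j)| \ge \gamma/\beta_j = c\gamma/s_j^2$. Consequently each
summand is at least $c^2\gamma^2 \tau/s_j^2 \ge c^3\gamma^2$, and the sum is bounded from below by
$c^3\gamma^2\, \#\{ j :\ c\tau \le s_j^2 \le \tau/c \}$, which tends to infinity as $\tau \to 0$.

Finally, by Lemma 3.1 we find that

         $\frac{v_\tau^2}{S_\tau^2} \le \frac{\gamma_*^2}{\tau\, \mathrm{tr}\bigl[g_\tau^2(T^*T)T^*T\bigr]},$

and the second assertion is a consequence of the first one. □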

Remark 3.1. The assumptions which are imposed above on $g_\tau$ are known to hold
for many regularization schemes, in particular for spectral cut-off and (iterated)
Tikhonov regularization. The assumption on the singular numbers holds for (at
most) polynomial decay.

Lemma 3.2 implies that in some specified cases the separation radius defined in
(2.12) is of size $C^*_{\alpha,\beta}\, S_\tau v_\tau$ as $\tau \to 0$. This is summarized in the following corollary.

Corollary 3.1. Let $C^*_{\alpha,\beta}$ and $r^2(\Phi_{\alpha,R}, \beta)$ be as in (2.11) and (2.12), respectively.
Under the assumptions of Lemma 3.2 we have that

         $\frac{r^2(\Phi_{\alpha,R_\tau}, \beta)}{C^*_{\alpha,\beta}\, S_\tau v_\tau} \to 1 \quad\text{as } \tau \to 0.$
We turn to bounding the bias $\|f - f_\tau\|$. This can be done under the assumption
that the chosen regularization has enough qualification, see e.g. [TT|.

Definition 3 (qualification). Suppose that $\varphi$ is an index function. The regularization
$g_\tau$ is said to have qualification $\varphi$ if there is a constant $\gamma < \infty$ such that

         $\sup_{0 \le t \le \|T^*T\|} |r_\tau(t)|\, \varphi(t) \le \gamma\varphi(\tau), \qquad \tau > 0.$

Remark 3.2. It is well known that Tikhonov regularization has qualification $\varphi(t) = t$
with constant $\gamma = 1$, and this is the maximal power. On the other hand, truncated
svd has arbitrary qualification with constant $\gamma = 1$.

In this case we can bound the bias at $f_R = f_\tau$.

Proposition 3.1. Let $g_\tau$ be any regularization having qualification $\varphi$ with
constant $\gamma$. If $f \in \mathcal{E}_\varphi$ then

         $\|f - f_\tau\| \le \gamma\varphi(\tau).$

Proof. Let $\omega$ with $\|\omega\| \le 1$ be such that $f = \varphi(T^*T)\omega$. Then

         $\|f_\tau - f\| = \|g_\tau(T^*T)T^*Tf - f\| = \|r_\tau(T^*T)f\| = \|r_\tau(T^*T)\varphi(T^*T)\omega\| \le \gamma\varphi(\tau).$

□

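Now we have established bounds for all quantities occurring in (3.2), and this
yields the main result for linear regularization.

Theorem 3.1. Assume that Assumption A2 holds, and suppose that $g_\tau$ is a
regularization which has qualification $\varphi$, and that $f \in \mathcal{E}_\varphi$. Let $\tau_*$ be chosen from the
equation

(3.6)    $\varphi^2(\tau) = \sigma^2\, \frac{\sqrt{\mathcal{N}(\tau)}}{\tau}.$

Then, for all $f \in \mathcal{E}_\varphi$,

         $\inf_{\tau > 0} \bigl( r^2(\Phi_{\alpha,R_\tau}, \beta) + \|f - f_\tau\|^2 \bigr) \le \Bigl( C^*_{\alpha,\beta}\sqrt{2}\,\gamma_*^2 + \frac{(4x_\alpha + 8x_\beta)\gamma_*^2}{\sqrt{\mathcal{N}(\tau_*)}} + \gamma^2 \Bigr) \varphi^2(\tau_*),$

where the constant $C^*_{\alpha,\beta}$ has been introduced in (2.11). In particular, we get that

         $\rho^2(\Phi_{\alpha,R_{\tau_*}}, \beta, \mathcal{E}_\varphi) \le 2\Bigl( C^*_{\alpha,\beta}\sqrt{2}\,\gamma_*^2 + \frac{(4x_\alpha + 8x_\beta)\gamma_*^2}{\sqrt{\mathcal{N}(\tau_*)}} + \gamma^2 \Bigr) \varphi^2(\tau_*).$

Proof. By Proposition 3.1, Lemma 3.1 and Proposition 2.2 we have, for $\tau = \tau_*$, that

         $r^2(\Phi_{\alpha,R_\tau}, \beta) + \|f - f_\tau\|^2$
         $\quad = C^*_{\alpha,\beta}\, S_\tau v_\tau + (4x_\alpha + 8x_\beta)\, v_\tau^2 + \|f - f_\tau\|^2$
         $\quad \le C^*_{\alpha,\beta}\sqrt{2}\,\gamma_*^2 \sigma^2\, \frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + (4x_\alpha + 8x_\beta)\gamma_*^2\, \frac{\sigma^2}{\tau} + \gamma^2\varphi^2(\tau)$
         $\quad \le \Bigl( C^*_{\alpha,\beta}\sqrt{2}\,\gamma_*^2 + \frac{(4x_\alpha + 8x_\beta)\gamma_*^2}{\sqrt{\mathcal{N}(\tau_*)}} + \gamma^2 \Bigr) \varphi^2(\tau_*),$

since the parameter $\tau_*$ equates both terms $\varphi^2(\tau)$ and $\sigma^2\tau^{-1}\sqrt{\mathcal{N}(\tau)}$. This gives the
upper bound. □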
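
The balancing equation (3.6) rarely has a closed-form solution, but it is easy to solve numerically: the left-hand side $\varphi^2(\tau)$ is non-decreasing and the right-hand side $\sigma^2\sqrt{\mathcal{N}(\tau)}/\tau$ is decreasing in $\tau$. The Python sketch below uses bisection on a logarithmic scale. The function names, the bracketing interval, the truncated sequence $s_k = 1/k$ and the index function $\varphi(u) = u^{1/2}$ are illustrative assumptions only.

```python
import math


def effective_dimension(sing_vals, tau):
    """N(tau) = sum_j s_j^2 / (s_j^2 + tau), cf. (3.4)."""
    return sum(s * s / (s * s + tau) for s in sing_vals)


def noise_term(sing_vals, sigma, tau):
    """Right-hand side of (3.6): sigma^2 * sqrt(N(tau)) / tau."""
    return sigma**2 * math.sqrt(effective_dimension(sing_vals, tau)) / tau


def balance_tau(phi, sing_vals, sigma, lo=1e-12, hi=1.0, iters=100):
    """Approximate tau_* with phi(tau)^2 = sigma^2 sqrt(N(tau)) / tau.

    Assumes the gap phi^2 - noise_term is negative at `lo` and positive at `hi`.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisection on a log scale
        if phi(mid) ** 2 > noise_term(sing_vals, sigma, mid):
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)


# Hypothetical setting: s_k = 1/k (2000 terms), phi(u) = u^(1/2), sigma = 0.01.
sing_vals = [k ** -1.0 for k in range(1, 2001)]
phi = lambda u: u ** 0.5
tau_star = balance_tau(phi, sing_vals, sigma=0.01)
```

A smaller noise level leads to a smaller balancing parameter, which matches the asymptotic statement that $\tau_*$ tends to zero with $\sigma$.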
Remark 3.3. Up to now, all the presented results are non-asymptotic in the sense
that we do not require that $\sigma \to 0$. In an asymptotic setting, we can remark that
$\tau_*$ as defined in (3.6) satisfies $\tau_* \to 0$ as $\sigma \to 0$. Since the effective dimension tends
to infinity as $\tau \to 0$, we get that

         $\rho^2(\Phi_{\alpha,R_{\tau_*}}, \beta, \mathcal{E}_\varphi) \le 2\bigl( C^*_{\alpha,\beta}\sqrt{2}\,\gamma_*^2 (1 + o(1)) + \gamma^2 \bigr) \varphi^2(\tau_*),$

as $\sigma \to 0$.

We shall highlight the above results with two examples. We dwell on
these in order to show that the above results are consistent with other results for
inverse testing (see for instance [17]).

Example 3 (moderately ill-posed problem). Let us assume that the singular
numbers of the operator $T$ decay as $s_k \asymp k^{-t}$, $k \in \mathbb{N}$, with $t > 1/2$ (in order to ensure
that Assumption A2 is satisfied). In this case the effective dimension asymptotically
behaves like $\mathcal{N}(\tau) \asymp \tau^{-1/(2t)}$, as $\tau \to 0$, see for instance [4, Ex. 3]. The
Sobolev ball

(3.7)    $\mathcal{E}_{a,2} := \Bigl\{ f \in H :\ \sum_{j=1}^{\infty} a_j^2 \langle f, u_j\rangle^2 \le 1 \Bigr\}, \quad\text{with } a_j = j^s,\ j \ge 1,$

as considered in [18] coincides (up to constants) with $\mathcal{E}_\varphi$ for the function
$\varphi(u) = u^{s/(2t)}$, $u > 0$. In this case the value $\tau_*$ from (3.6) is computed as $\tau_* \asymp \sigma^{8t/(4s+4t+1)}$,
which results in an asymptotic separation rate of

         $\rho(\Phi_{\alpha,R_{\tau_*}}, \beta, \mathcal{E}_\varphi) \asymp \varphi(\tau_*) \asymp \sigma^{4s/(4s+4t+1)}, \qquad \sigma \to 0,$

which corresponds to the 'mildly ill-posed case' in [18] or [15], and it is known to
be minimax.
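
The exponents in Example 3 follow from balancing powers of $\tau$ in (3.6): with $\varphi^2(\tau) = \tau^{s/t}$ and $\mathcal{N}(\tau) \asymp \tau^{-1/(2t)}$ the equation reads $\tau^{s/t} \asymp \sigma^2 \tau^{-1 - 1/(4t)}$. The short Python sketch below carries out this arithmetic exactly with rational numbers; the function name is hypothetical and constants hidden in $\asymp$ are ignored.

```python
from fractions import Fraction


def moderately_ill_posed_exponents(s, t):
    """Exponents a, b with tau_* ~ sigma^a and rate ~ sigma^b for s_k ~ k^(-t)."""
    s, t = Fraction(s), Fraction(t)
    # tau^(s/t + 1 + 1/(4t)) ~ sigma^2  =>  tau ~ sigma^(8t / (4s + 4t + 1))
    tau_exp = 2 / (s / t + 1 + 1 / (4 * t))
    # rate = phi(tau_*) = tau_*^(s / (2t))
    rate_exp = tau_exp * s / (2 * t)
    return tau_exp, rate_exp


tau_exp, rate_exp = moderately_ill_posed_exponents(1, 1)
```

For $s = t = 1$ this gives $\tau_* \asymp \sigma^{8/9}$ and a separation rate of order $\sigma^{4/9}$.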

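Example 4 (severely ill-posed problem). Here we assume a decay of the form
$s_k \asymp \exp(-\gamma k)$, $k \in \mathbb{N}$, of the singular numbers. The effective dimension
behaves like $\mathcal{N}(\tau) \asymp \frac{1}{2\gamma}\log(1/\tau)$. The Sobolev ball from (3.7) is now given as $\mathcal{E}_\varphi$
for a function $\varphi(u) = \bigl( \frac{1}{2\gamma}\log(1/u) \bigr)^{-s}$. Then the value $\tau_*$ calculates as
$\tau_* \asymp \sigma^2 \bigl(\log(1/\sigma^2)\bigr)^{2s+1/2}$, which results in a separation rate

         $\rho(\Phi_{\alpha,R_{\tau_*}}, \beta, \mathcal{E}_\varphi) \asymp \varphi(\tau_*) \asymp \log^{-s}(1/\sigma^2), \qquad \sigma \to 0,$

again recovering the corresponding result from [18].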

3.2. Projection schemes. Here we follow the ideas from the previous section. 
Details on the solution of ill-posed equations by using projection schemes can be 
found in [3T| |2S1 I2S] : and our outline follows the recent [H] . In particular we use 
the intrinsic requirements such as quasi-optimality and robustness of projection 
schemes in order to obtain a control similar to the previous section. 

We fix a finite dimensional subspace $H_m \subset H$, called the design space, and/or
a finite dimensional subspace $K_n \subset K$, called the data space. Throughout we
shall denote the corresponding orthogonal projection onto $H_m$ by $P_m$, and/or
the orthogonal projection onto $K_n$ by $Q_n$. The subscripts $m$ and $n$ denote the
dimensions of the spaces. Given such a couple $(H_m, K_n)$ of spaces we turn from the
equation (1.1) to its discretization

(3.8)    $Q_n Y = Q_n T P_m x + \sigma Q_n \xi.$

Without further assumptions, the finite dimensional equation (3.8) may have no or
many solutions, and hence we shall turn to the least-squares solution as given by
the Moore-Penrose inverse, i.e., we assign

(3.9)    $f_{m,n} := (Q_n T P_m)^{\dagger} Y.$

Definition 4 (projection scheme, see |5T1). If we are given

(1) an increasing sequence $H_1 \subset H_2 \subset \dots \subset H$, and

(2) an increasing sequence $K_1 \subset K_2 \subset \dots \subset K$, together with

(3) a mapping $m \mapsto n(m)$, $m = 1, 2, \dots$,

then the corresponding sequence of mappings

(3.10)   $Y \mapsto f_{m,n(m)} := (Q_n T P_m)^{\dagger} Y$

is called projection scheme.

Example 5 (truncated svd, spectral cut-off). The truncated svd, as introduced in
Example 1, is also an example of a projection scheme, if we use the increasing
sequences $H_m := \mathrm{span}\{u_1, \dots, u_m\} \subset H$, and $K_n := \mathrm{span}\{v_1, \dots, v_n\} \subset K$,
respectively. In this case we see that $(Q_n T P_m)^{\dagger} Y = \sum_{j=1}^{m} \frac{1}{s_j}\langle Y, v_j\rangle u_j$.

Henceforth we shall always assume that the mapping $(Q_n T P_m)^{\dagger}\colon K_n \to H_m$
is invertible, i.e., the related linear system of equations has a unique solution.
This gives an (implicit) relation $n = n(m)$; typically $n = m$ will do. However,
our subsequent analysis will be done using the dimension $m$ of the space $H_m$ for
quantification. In accordance with this we will denote $f_R$ by $f_m$, highlighting
the dependence on the dimension. Thus the linear reconstruction $R$ is given as
$R := (Q_n T P_m)^{\dagger}$, and we need to control $\mathrm{tr}[R^*R]$ as well as $\|R\|$. The latter is
related to the robustness (stability) of the scheme.

Definition 5 (Robustness). A projection scheme $\bigl((Q_n T P_m)^{\dagger},\ m \in \mathbb{N}\bigr)$ is said to
be robust if there is a constant $D_R < \infty$ for which

(3.11)   $\bigl\|(Q_n T P_m)^{\dagger}\bigr\| \le \frac{D_R}{j(T, H_m)}, \qquad m = 1, 2, \dots$

Here, the quantity $j(T, H_m)$ denotes the modulus of injectivity of $T$ with respect to
the subspace $H_m$, given as

         $j(T, H_m) := \inf\{ \|Tx\| :\ x \in H_m,\ \|x\| = 1 \}.$

The modulus of injectivity is always smaller than the $m$-th singular number $s_m =
s_m(T)$ of the mapping $T$, and hence we say that the subspaces $H_m$ satisfy a
Bernstein-type inequality if there is a constant $0 < C_B \le 1$ such that

         $C_B\, s_m(T) \le j(T, H_m).$

We summarize our previous outline as follows. 

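Lemma 3.3. Suppose that the projection scheme $\bigl((Q_n T P_m)^{\dagger},\ m \in \mathbb{N}\bigr)$ is robust
and that the spaces $H_m$ obey a Bernstein-type inequality. Then

         $\bigl\|(Q_n T P_m)^{\dagger}\bigr\| \le \frac{D_R}{C_B}\, \frac{1}{s_m}.$

In particular we have that

         $v_m^2 \le \frac{D_R^2}{C_B^2}\, \frac{\sigma^2}{s_m^2}.$

We turn to bounding $S_m^2$. Before doing so we mention that for spectral cut-off
from Example 5 this bound can easily be established.

Lemma 3.4. For spectral cut-off we have

         $S_m^2 = \sigma^2\, \mathrm{tr}\bigl[\bigl((Q_n T P_m)^{\dagger}\bigr)^* (Q_n T P_m)^{\dagger}\bigr] = \sigma^2 \sum_{j=1}^{m} \frac{1}{s_j^2}.$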
In order to achieve a similar bound in more general situations we need to impose
restrictions on the decay of the singular numbers $s_j$, $j = 1, 2, \dots$ The use of
projection schemes for severely ill-posed problems requires particular care, and the
following restriction, which will be imposed on the decay of the singular numbers of the
operator $T$, takes this into account. We shall assume that the decreasing sequence
$s_j$, $j = 1, 2, \dots$, is regularly varying for some index $-r$, for some $r > 0$, and we refer
to [5] for a treatment. In particular this covers moderately ill-posed problems where
$s_j \asymp j^{-r}$.

We will not use the index $r$. However, if the sequence $s_j$, $j = 1, 2, \dots$,
is regularly varying with index $-r$ then the sequence $s_j^{-2}$, $j = 1, 2, \dots$, is regularly
varying with index $2r$, and we have that

         $\frac{1}{m} \sum_{j=1}^{m} \frac{s_m^2}{s_j^2} \to \frac{1}{2r + 1} \quad\text{as } m \to \infty.$

In particular there is a constant $C_r$ such that

(3.13)   $\frac{m}{s_m^2} \le C_r^2 \sum_{j=1}^{m} \frac{1}{s_j^2},$

and the latter bound is actually all that is needed.
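
For the model sequence $s_j = j^{-r}$ the limit behind (3.13) can be checked numerically. The Python sketch below evaluates the averaged ratio $\frac{1}{m}\sum_{j \le m} s_m^2/s_j^2$; the function name and the chosen values of $r$ and $m$ are illustrative assumptions.

```python
def averaged_ratio(r, m):
    """(1/m) * sum_{j<=m} s_m^2 / s_j^2 for the model sequence s_j = j^(-r)."""
    # s_m^2 / s_j^2 = (j / m)^(2r)
    return sum((j / m) ** (2.0 * r) for j in range(1, m + 1)) / m


ratio_r1 = averaged_ratio(1.0, 100000)    # expected to be close to 1/3
ratio_half = averaged_ratio(0.5, 100000)  # expected to be close to 1/2
```

Both values are close to $1/(2r+1)$, so for this sequence a constant $C_r^2$ slightly larger than $2r+1$ would suffice in (3.13) for large $m$.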

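Lemma 3.5. Suppose that the sequence $s_j$, $j = 1, 2, \dots$, is such that for the constant
$C_r$ the estimate (3.13) holds. If the projection scheme is robust with constant $D_R$,
and if the spaces $H_m$ obey a Bernstein-type inequality with constant $C_B$ then

         $S_R^2 = S_m^2 = \sigma^2\, \mathrm{tr}\bigl[\bigl((Q_n T P_m)^{\dagger}\bigr)^* (Q_n T P_m)^{\dagger}\bigr] \le C_r^2\, \frac{D_R^2}{C_B^2}\, \sigma^2 \sum_{j=1}^{m} \frac{1}{s_j^2}.$

If, in addition, Assumption A2 is satisfied, then we have that

         $S_R^2 = S_m^2 \le 2 C_r^2\, \frac{D_R^2}{C_B^2}\, \sigma^2\, \frac{\mathcal{N}(s_m^2)}{s_m^2}.$

Proof. We notice that the mapping $\bigl((Q_n T P_m)^{\dagger}\bigr)^*$ is zero on $H_m^{\perp}$, the orthogonal
complement of $H_m$. So, we take an orthonormal system $u_1, u_2, \dots, u_m, \dots$, where
the first $m$ components span $H_m$. With respect to this system we see that

         $\mathrm{tr}\bigl[\bigl((Q_n T P_m)^{\dagger}\bigr)^* (Q_n T P_m)^{\dagger}\bigr] = \mathrm{tr}\bigl[(Q_n T P_m)^{\dagger} \bigl((Q_n T P_m)^{\dagger}\bigr)^*\bigr]$
         $\quad = \sum_{j=1}^{\infty} \bigl\|\bigl((Q_n T P_m)^{\dagger}\bigr)^* u_j\bigr\|^2 = \sum_{j=1}^{m} \bigl\|\bigl((Q_n T P_m)^{\dagger}\bigr)^* u_j\bigr\|^2$
         $\quad \le m\, \bigl\|\bigl((Q_n T P_m)^{\dagger}\bigr)^*\bigr\|^2 = m\, \bigl\|(Q_n T P_m)^{\dagger}\bigr\|^2.$

Using Lemma 3.3 we see that $\mathrm{tr}\bigl[\bigl((Q_n T P_m)^{\dagger}\bigr)^* (Q_n T P_m)^{\dagger}\bigr] \le m\, \frac{D_R^2}{C_B^2 s_m^2}$. Now we
use (3.13) to complete the proof of the first assertion. Under Assumption A2 we
continue and use the inequality $u/v \le 2v/(u + v)$, $0 < u \le v$, to see that

         $\sum_{j=1}^{m} \frac{1}{s_j^2} \le \frac{2}{s_m^2} \sum_{j=1}^{m} \frac{s_j^2}{s_j^2 + s_m^2} \le 2\, \frac{\mathcal{N}(s_m^2)}{s_m^2},$

and the proof is complete. □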
Remark 3.4. Notice that Lemma 3.5 provides us with (an order optimal) bound
for the variance, even if the operator $T$ is not a Hilbert-Schmidt one. But, if it is
then the obtained bound corresponds to the one from Lemma 3.1 (with $\tau \leftarrow s_m^2$).

Next, we need to bound $\|f - f_m\|$, as this was done in § 3.1 by assuming
qualification, and we need a further property of the projection scheme, called
quasi-optimality. We start with the following well-known result, originally from spline
interpolation [5], and used for projection schemes in [53], which states that

(3.14)   $\|f - f_m\| \le \bigl\|(Q_n T P_m)^{\dagger} T\bigr\|\, \|(I - P_m) f\|.$

Therefore, we can bound the bias whenever the norms $\bigl\|(Q_n T P_m)^{\dagger} T\bigr\|$ are uniformly
bounded.

Definition 6 (quasi-optimality). A projection scheme $Y \mapsto (Q_n T P_m)^{\dagger} Y$ is
quasi-optimal if there is a constant $D_Q$ such that $\bigl\|(Q_n T P_m)^{\dagger} T\bigr\| \le D_Q$.

We emphasize that under quasi-optimality the bound for the bias entirely
depends on the quality of the projections $P_m$ with respect to the element $f$.

Definition 7 (Degree of approximation). Suppose that $\{H_m\}$, $\dim(H_m) \le m$, is a nested set of design spaces. The spaces $H_m$ are said to have the degree of approximation $\varphi$ if there is a constant $C_D < \infty$ with

(3.15) $\|(I - P_m)\varphi(T^*T)\| \le C_D\,\varphi(s_{m+1}^2)$, $m = 1, 2, \dots$

For spectral cut-off this bound (with constant $C_D = 1$) is best possible. Also, using interpolation type inequalities one can verify this property for many known approximation spaces $H_m$, $m = 1, 2, \dots$; we refer to [21] for more details on degree of approximation and Bernstein-type bounds. We can now state the analogue of Proposition 3.1 for projection schemes.

Proposition 3.2. Suppose that the projection scheme is quasi-optimal with constant $D_Q$, and that it has the degree of approximation $\varphi$ with constant $C_D$. If $f \in \mathcal{E}_\varphi$ then we have that

$\|f - f_m\| \le D_Q C_D\,\varphi(s_{m+1}^2).$

We now return to the problem raised in p.2p . Here, the family of reconstructions 
R runs over all projection schemes, and we can control the bound by a proper choice 
of the discretization level m. 

For the sake of convenience, we will assume in the following that Assumption I A2 1 
is satisfied, i.e. that T is a Hilbert-Schmidt operator. If it is not the case. Theo- 
rem [321 below remains valid when replacing y/J^ls^/Sm by yX^fLi ^7 /*m- 

Theorem 3.2. Suppose that the approximate solutions are obtained by a projection scheme which is quasi-optimal and robust and that Assumption A2 holds. Furthermore assume that the design spaces $H_m$ have degree of approximation $\varphi$ and obey a Bernstein-type inequality. Let $m_*$ be chosen from

(3.16) $m_* = \max\Big\{ m : \ \varphi^2(s_m^2) \ge \sigma^2\,\frac{\sqrt{\mathcal{N}(s_m^2)}}{s_m^2} \Big\}.$



If f £ £,n then we have that 



inf 



[r\^o.,P) + \\f-fra\?} 



D2 



^ \Cl,p-^Cr + {Axa. + ^xp 



^R 



ci^M^ 



+ DlCl\^\slJ, 
where the constant $C^*_{\alpha,\beta}$ has been introduced in (2.11).

Proof. By using Lemma 3.5 and Proposition 3.2 we see that for any choice of discretization level $m$ we have



^ ^* tLjiri .,2 V-^\^m) , /.^ , o^ \^2:^^ 1 , n2 r'Z ,o2('„2 \ 
- '~'a,0-7^'-^ra- -^ + [AXa + Sx^JCT ^^ 72~ + ^Q^Of (Srn+l) 



j 2V^^ 2,2 ,1 

X max <^ a ,Lp (s„+i) ^ . 

At the discretization level m^^ + 1 we see by monotonicity that 

Also, by the choice of $m_*$ we see that

$\sigma^2\,\frac{\sqrt{\mathcal{N}(s_{m_*}^2)}}{s_{m_*}^2} \le \varphi^2(s_{m_*}^2),$

hence both terms in the max are dominated by $\varphi^2(s_{m_*}^2)$, which allows us to complete the proof. □

Once again, the previous result is non-asymptotic. In the asymptotic regime, we 
get the following improvement. 

Corollary 3.2. Under the assumptions of Theorem 3.2 we get that

inf {r2($„,/3) + 11/ - /,„||2} < (^c;^^a(l + 0(1)) + 0^01^ ^^slj, 
as (T — > 0. 



This is an easy consequence of the fact that along with $\sigma \to 0$ we have $s_{m_*} \to 0$, and hence the effective dimension at $s_{m_*}^2$ tends to infinity.

3.3. Discussion. We first highlight the important fact that in both cases (provided Assumption A2 is satisfied), for linear regularization and for projection schemes, the upper bound is obtained by solving the same 'equation', $\sigma^2 = \tau\varphi^2(\tau)/\sqrt{\mathcal{N}(\tau)}$, relating $\tau_* \sim s_{m_*}^2$, see Theorems 3.1 and 3.2. However, this function is different from the one used for function estimation in inverse problems. In the same setting the 'optimal' parameter $\tau_{\mathrm{est}}$ is there obtained from solving

$\varphi^2(\tau) = \sigma^2\,\frac{\mathcal{N}(\tau)}{\tau}.$

Thus, the effective dimension $\mathcal{N}$, which is designed for estimation, enters the inverse testing problem in square root, such that, loosely speaking, testing is easier.
Another remark may be of interest. For the estimation problem, within the same context, the bias variance decomposition leads to a variance term $S^2$, and in order to achieve optimal order reconstruction, this will be calibrated with the function $\varphi^2$. As we have seen above, for testing the same calibration is done between the functions $Sv$ and $\varphi^2$. Since, as already mentioned, $Sv \le S^2$, this calibration always yields a smaller value, which again explains the different rates for separation radius and estimation error.
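The comparison between the two calibrations can be illustrated numerically. The following is a minimal sketch under assumptions that are not taken from the text: singular numbers $s_j = j^{-t}$ truncated after finitely many terms, the power-type smoothness $\varphi(u) = u^{s/(2t)}$, and the effective dimension in the usual form $\mathcal{N}(\tau) = \sum_j s_j^2/(s_j^2 + \tau)$. It solves $\sigma^2 = \tau\varphi^2(\tau)/\sqrt{\mathcal{N}(\tau)}$ (testing) and $\sigma^2 = \tau\varphi^2(\tau)/\mathcal{N}(\tau)$ (estimation) by bisection; all function names are hypothetical.

```python
import math

def effective_dimension(tau, decay=1.0, terms=5000):
    """N(tau) = sum_j s_j^2 / (s_j^2 + tau) for the assumed s_j = j^(-decay)."""
    return sum(1.0 / (1.0 + tau * j ** (2.0 * decay)) for j in range(1, terms + 1))

def phi_squared(tau, smoothness=1.0, decay=1.0):
    """phi^2(tau) for the assumed power-type smoothness phi(u) = u^(s/(2t))."""
    return tau ** (smoothness / decay)

def g_testing(tau):
    # Right-hand side of the testing calibration: tau * phi^2(tau) / sqrt(N(tau)).
    return tau * phi_squared(tau) / math.sqrt(effective_dimension(tau))

def g_estimation(tau):
    # Right-hand side of the estimation calibration: tau * phi^2(tau) / N(tau).
    return tau * phi_squared(tau) / effective_dimension(tau)

def solve_increasing(g, target, lo=1e-12, hi=1.0, iters=80):
    """Geometric bisection for g(tau) = target, with g increasing on [lo, hi]."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

sigma2 = 1e-4  # assumed squared noise level
tau_test = solve_increasing(g_testing, sigma2)
tau_est = solve_increasing(g_estimation, sigma2)
print("testing parameter:", tau_test, "estimation parameter:", tau_est)
```

In this toy setting the testing parameter comes out smaller than the estimation parameter, so that $\varphi^2$ evaluated there is smaller as well, in line with the remark that testing is easier.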




Previous analysis of the spectral cut-off regularization scheme for testing in inverse problems revealed the importance of the quantity

$\rho_D := \Big(\sum_{j=1}^{D} \frac{1}{s_j^4}\Big)^{1/4}.$

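We mention the non-asymptotic lower and upper bounds, slightly adapted to the present setup, given for instance in [18] as

$\rho^2(\mathcal{E}_\varphi,\alpha,\beta) \ge \sup_D \min\{c_{\alpha,\beta}\,\sigma^2\rho_D^2,\ \varphi^2(s_D^2)\},$

$\rho^2(\mathcal{E}_\varphi,\alpha,\beta) \le \inf_D \big(C^*_{\alpha,\beta}\,\sigma^2\rho_D^2 + \varphi^2(s_D^2)\big).$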

Thus the bounds established in this study are sharp whenever $\rho_D^2 \asymp S_D v_D$, where $S_D^2 = \sum_{j=1}^{D} s_j^{-2}$ and $v_D = s_D^{-1}$, respectively. More explicitly, this is the case if

$\sum_{j=1}^{D} \frac{1}{s_j^4} \asymp \frac{1}{s_D^2}\sum_{j=1}^{D} \frac{1}{s_j^2}.$

This concerns only the decay rate of the singular numbers $s_j$ of the operator $T$. It holds for regularly varying singular numbers, and it also holds true for $s_j \asymp \exp(-\gamma j)$, $j = 1, 2, \dots$, thus covering severely ill-posed problems. Remark that instead of the terms involved in (3.17), the quantities $S_D$ and $v_D$ have a nice interpretation as strong and weak variances of the spectral cut-off schemes.
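The comparison between $\sum_{j \le D} s_j^{-4}$ and $s_D^{-2}\sum_{j \le D} s_j^{-2}$ can be checked numerically. The sketch below is an illustration under assumed decay laws ($s_j = j^{-1}$ and $s_j = e^{-j}$, both chosen here and not taken from the text); the helper name `sharpness_ratio` is hypothetical.

```python
import math

def sharpness_ratio(singular_values, D):
    """Ratio  sum_{j<=D} s_j^-4  /  (s_D^-2 * sum_{j<=D} s_j^-2)."""
    head = singular_values[:D]
    rho4 = sum(s ** -4 for s in head)      # rho_D^4
    strong = sum(s ** -2 for s in head)    # S_D^2
    weak = head[-1] ** -2                  # v_D^2
    return rho4 / (strong * weak)

# Assumed polynomial and exponential decay of the singular numbers.
poly = [j ** -1.0 for j in range(1, 201)]
expo = [math.exp(-1.0 * j) for j in range(1, 51)]

poly_ratios = [sharpness_ratio(poly, D) for D in (1, 10, 100, 200)]
expo_ratios = [sharpness_ratio(expo, D) for D in (1, 10, 25, 50)]
print(poly_ratios, expo_ratios)
```

For non-increasing singular numbers the ratio never exceeds one, and in both assumed examples it stays bounded away from zero as $D$ grows, which is the two-sided comparison needed for sharpness.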

4. Relating the direct and inverse testing problem 

For injective linear operators $T$, the assertions "$f = 0$" and "$Tf = 0$" are equivalent. Hence, testing $H_0 : f = 0$ or testing $H_0 : Tf = 0$ is related to the same problem: we want to detect whether there is signal in the data. Nevertheless, these testing problems are different in the sense that the alternatives are not expressed in the same way. Indeed, the inverse testing problem (considered in the previous sections) corresponds to

(4.1) $H_0^I : f = 0$, against $H_1^I : f \in \mathcal{E}_\varphi,\ \|f\|^2 \ge (\rho^I)^2,$

while the direct testing problem corresponds to testing

(4.2) $H_0^D : Tf = 0$, against $H_1^D : f \in \mathcal{E}_\varphi,\ \|Tf\|^2 \ge (\rho^D)^2.$

In this section, we investigate the similarities between these two viewpoints. In particular, we remark that both testing problems are not equivalent in the sense that the alternatives do not deal with the same object.

4.1. Relating the separation rates. The authors in [17] discussed whether both problems are related. The main result, Theorem 1, ibid., asserts that for a variety of cases each minimax test $\Phi_\alpha$ for the direct problem ($H_0 : Tf = 0$) is also minimax for the related inverse problem ($H_0 : f = 0$). This fundamental result is based on Lemma 1, ibid. Here we show that this lemma has its origin in interpolation in variable Hilbert scales, and we refer to [12]. Actually we do not need the machinery as developed there, but we may use the following special case, which may directly be proved using Jensen's inequality.

Lemma 4.1 (Interpolation inequality). Let ip be from i!^.3\) . and let Q{u) := 
$\sqrt{u}\,\varphi(u)$, $u > 0$. If the function $u \mapsto \varphi^2\big((\Theta^2)^{-1}(u)\big)$ is concave then

(4.3) $\|f\| \le \varphi\big(\Theta^{-1}(\|Tf\|)\big)$, $f \in \mathcal{E}_\varphi.$
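The interpolation inequality can be checked numerically in a diagonal model. The sketch below rests on assumptions not spelled out in this section: the class $\mathcal{E}_\varphi$ is taken to consist of elements $f = \varphi(T^*T)w$ with $\|w\| \le 1$, the operator is diagonal with $s_j = j^{-t}$, and $\varphi(u) = u^a$, so that $\Theta(u) = u^{a+1/2}$ and $\varphi(\Theta^{-1}(x)) = x^{2a/(2a+1)}$. The function name is hypothetical.

```python
import math
import random

def worst_interpolation_ratio(a=0.5, decay=1.0, dim=200, trials=200, seed=0):
    """Largest observed value of ||f|| / phi(Theta^{-1}(||Tf||)) over random f in the assumed class."""
    rng = random.Random(seed)
    s2 = [j ** (-2.0 * decay) for j in range(1, dim + 1)]  # eigenvalues of T*T
    worst = 0.0
    for _ in range(trials):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm_w = math.sqrt(sum(x * x for x in w))
        w = [x / norm_w for x in w]                      # ||w|| = 1
        f = [u ** a * x for u, x in zip(s2, w)]          # f = phi(T*T) w
        norm_f = math.sqrt(sum(x * x for x in f))
        norm_Tf = math.sqrt(sum(u * x * x for u, x in zip(s2, f)))
        bound = norm_Tf ** (2.0 * a / (2.0 * a + 1.0))   # phi(Theta^{-1}(||Tf||))
        worst = max(worst, norm_f / bound)
    return worst

print(worst_interpolation_ratio())
```

For power-type $\varphi$ the function $u \mapsto \varphi^2((\Theta^2)^{-1}(u)) = u^{2a/(2a+1)}$ is concave, so the ratio should never exceed one in this setting.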

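The main result relating the direct and inverse testing problems is the following.

Theorem 4.1. Let $\varphi$ be an index function with related function $\Theta$, such that the function $u \mapsto \varphi^2\big((\Theta^2)^{-1}(u)\big)$ is concave. Let $\Phi_\alpha$ be a level-$\alpha$ test for the direct problem $H_0^D : Tf = 0$ with uniform separation rate $\rho^D(\Phi_\alpha,\mathcal{E}_\Theta,\beta)$. Then $\Phi_\alpha$ constitutes a level-$\alpha$ test for the inverse problem $H_0^I : f = 0$ with uniform separation rate

$\rho^I(\Phi_\alpha,\mathcal{E}_\varphi,\beta) \le \varphi\big(\Theta^{-1}(\rho^D(\Phi_\alpha,\mathcal{E}_\Theta,\beta))\big).$

Consequently we have for the minimax separation rates that

(4.4) $\rho^I(\mathcal{E}_\varphi,\alpha,\beta) \le \varphi\big(\Theta^{-1}(\rho^D(\mathcal{E}_\Theta,\alpha,\beta))\big).$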

Proof. Clearly, the test $\Phi_\alpha$ is a level-$\alpha$ test for both problems, and we need to control the second kind error. But if $\|f\| \ge \varphi\big(\Theta^{-1}(\rho^D(\Phi_\alpha,\mathcal{E}_\Theta,\beta))\big)$ then Lemma 4.1 yields that $\|Tf\| \ge \rho^D(\Phi_\alpha,\mathcal{E}_\Theta,\beta)$, and the assertion is a consequence of the properties of the test for the direct problem.

If $\Phi_\alpha$ was minimax for the direct problem then the corresponding minimax rate for the inverse problem must be dominated by $\varphi\big(\Theta^{-1}(\rho^D(\mathcal{E}_\Theta,\alpha,\beta))\big)$, which gives (4.4). □

Remark 4.1. In many cases the bound (4.4) actually is an asymptotic equivalence

(4.5) $\varphi^{-1}\big(\rho^I(\mathcal{E}_\varphi,\alpha,\beta)\big) \asymp \Theta^{-1}\big(\rho^D(\mathcal{E}_\Theta,\alpha,\beta)\big)$, $\sigma \to 0.$

It may be enlightening to see this on the basis of Example 3. Recall that the function $\varphi$ was given as $\varphi(u) = u^{s/(2t)}$. The corresponding rate is known to be minimax, and we obtain that

$\varphi^{-1}\big(\rho^I(\mathcal{E}_\varphi,\alpha,\beta)\big) \asymp \sigma^{4t/(2s+2t+1/2)}.$

We turn to the direct problem, for which the corresponding smoothness class is $\mathcal{E}_\Theta$ for the function $\Theta(u) = u^{s/(2t)+1/2} = u^{(s+t)/(2t)}$. This corresponds to $p = s + t$ in [H, Tbl. 2], yielding the separation rate $\rho(\mathcal{E}_\Theta,\alpha,\beta) \asymp \sigma^{2(s+t)/(2s+2t+1/2)}$, which in turn gives

$\Theta^{-1}\big(\rho^D(\mathcal{E}_\Theta,\alpha,\beta)\big) \asymp \sigma^{4t/(2s+2t+1/2)},$

and hence (4.5) for moderately ill-posed problems.

Similarly, this holds for severely ill-posed problems, and we omit details.
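The exponent bookkeeping behind the moderately ill-posed case can be verified with exact rational arithmetic. The sketch below assumes the rates $\rho^I \asymp \sigma^{2s/(2s+2t+1/2)}$ and $\rho^D \asymp \sigma^{2(s+t)/(2s+2t+1/2)}$ together with $\varphi(u) = u^{s/(2t)}$ and $\Theta(u) = u^{(s+t)/(2t)}$; the inverse-problem rate is taken as an assumption here, and the function name is hypothetical.

```python
from fractions import Fraction

def sigma_exponents(s, t):
    """Exponents of sigma in phi^{-1}(rho^I) and Theta^{-1}(rho^D) under the assumed rates."""
    s, t = Fraction(s), Fraction(t)
    denom = 2 * s + 2 * t + Fraction(1, 2)
    rho_inverse = 2 * s / denom            # exponent of sigma in rho^I
    rho_direct = 2 * (s + t) / denom       # exponent of sigma in rho^D
    lhs = rho_inverse * (2 * t / s)        # phi^{-1}(x) = x^(2t/s)
    rhs = rho_direct * (2 * t / (s + t))   # Theta^{-1}(x) = x^(2t/(s+t))
    return lhs, rhs

lhs, rhs = sigma_exponents(2, 1)
print(lhs, rhs)
```

Both exponents reduce to $4t/(2s+2t+1/2)$ for any positive $s$ and $t$, which is the coincidence expressed by the asymptotic equivalence.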

We emphasize that, by virtue of Theorem 4.1, any lower bound for the minimax separation rate in the inverse testing problem yields a lower bound for the corresponding direct problem.

Remark 4.2. Thanks to Theorem 4.1 it is possible to prove that in all the cases considered in this paper, a test which is minimax for (4.2) will also be minimax for (4.1). Nevertheless, the reverse is not true. We will not go into details; instead we refer to [17] for a detailed discussion on this subject.

4.2. Designing tests for the direct problem. The coincidence in (4.5) is not by chance, and we indicate a further result in this direction. Recall from §3.1 that the value of $\tau_* = \tau_*^I$ was obtained from (3.6), and hence that we actually have $\rho(\mathcal{E}_\varphi,\alpha,\beta) \asymp \varphi(\tau_*^I)$, such that the left hand side in (4.5) equals $\tau_*^I$. We shall see next that the corresponding value $\tau_* = \tau_*^D$ is obtained from the same equation (3.6) when basing the direct test on the family $\tilde{R}_\tau = T R_\tau$ with family $R_\tau = g_\tau(T^*T)T^*$ as in §3.1. Then $T R_\tau = g_\tau(TT^*)TT^*$, and we bound its variance and weak variance next.

Lemma 4.2. Let $\tilde{R}_\tau = T R_\tau = g_\tau(TT^*)TT^*$ and denote by $\tilde{S}_\tau$ and $\tilde{v}_\tau$, respectively, the corresponding strong and weak variance. If Assumption A2 holds then

(1) $\tilde{S}_\tau^2 \le (\gamma_0 + \gamma_*)\gamma_0\,\mathcal{N}(\tau)$, $\tau > 0$, and
(2) $\tilde{v}_\tau^2 \le \gamma_0^2$.

We also need to bound the bias $\|Tf - Tf_\tau\|$ with $f_\tau = g_\tau(T^*T)T^*Tf$.

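Lemma 4.3. Assume that $f \in \mathcal{E}_\varphi$. If the regularization $g_\tau$ has qualification $\Theta$ with constant $\gamma$ then

$\|Tf - Tf_\tau\| \le \gamma\,\Theta(\tau).$

Proof. Since $f_\tau = R_\tau Tf$, we get that

$\|Tf - Tf_\tau\| = \|Tf - g_\tau(TT^*)TT^*Tf\| = \|r_\tau(TT^*)Tf\|,$

which is bounded by $\gamma\,\Theta(\tau)$ as soon as $f \in \mathcal{E}_\varphi$ and $g_\tau$ has qualification $\Theta$. □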

We recah from §|3]the quantity r^($Q.,/3) := Ca^pSrVr, where we now consider 
Rt and Vr from Lemma |4.2I for bounding ||T/|| > Ca^pSrVr from below. 

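Corollary 4.1. Suppose that $g_\tau$ is a regularization which has qualification $\Theta$, $f \in \mathcal{E}_\varphi$, and that Assumption A2 holds. Let $\tau_*^D$ be chosen from the equation

(4.6) $\sigma^2 = \frac{\Theta^2(\tau)}{\sqrt{\mathcal{N}(\tau)}}.$

Then

$\inf_{\tau > 0}\big(r^2(\Phi_\alpha,\beta) + \|Tf - Tf_\tau\|^2\big) \le \Big(C_{\alpha,\beta}\sqrt{(\gamma_0 + \gamma_*)\gamma_0}\,\gamma_0 + \gamma^2\Big)\Theta^2(\tau_*^D).$

We stress that the equation (4.6) for determining $\tau_*^D$ is the same equation as (3.6), since $\Theta^2(\tau) = \tau\varphi^2(\tau)$, and this explains the identical asymptotics in (4.5) as being equal to $\tau_*^D = \tau_*^I$.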

This result sheds light on another interesting problem: if we want to use the regularization $TR_\tau$, and if we want to have the same optimal performance properties, then the underlying regularization $g_\tau$ must have the higher qualification $\Theta$ for the direct problem, as compared to its use in inverse testing, which requires qualification $\varphi$ only. This cannot be seen when confining to spectral cut-off, but this problem is relevant when considering other regularization schemes for testing. It is thus interesting to design estimators for $g = Tf$ which do not rely on estimation of $f$. However, since the data $Y$ do not belong to the space $K$, either discretization or some other kind of preconditioning is necessary in order to estimate $g = Tf$ from the data $Y$. Such direct estimation is simple by using projection schemes, and we exhibit the calculus for one-sided discretization. As in §3.2 we choose finite ($m$-)dimensional subspaces $Y_m \subset K$, with corresponding projections $Q_m$, and consider the data

$Q_m Y = Q_m g + \sigma Q_m \xi$, $m \in \mathbb{N}.$

This approach is called dual least squares scheme in regularization, see [33]. Here it is easy to see that $S^2 = \operatorname{tr}[Q_m^* Q_m] = m$, while $v^2 = \|Q_m\| = 1$. In order to continue we just need that the chosen projections have degree of approximation $\Theta$, i.e., there is $C_D$ for which $\|(I - Q_m)\Theta(TT^*)\| \le C_D\,\Theta(s_{m+1}^2)$, $m = 1, 2, \dots$. With this requirement at hand we can continue as if the projections $Q_m$ were the projections onto the first $m$ singular elements in the svd of $T$. In particular we have the upper bound on the separation radius

$\rho^2(\mathcal{E}_\Theta,\alpha,\beta) \le \max\{C_{\alpha,\beta}, C_D^2\}\,\inf_m \big(\sigma^2\sqrt{m} + \Theta^2(s_{m+1}^2)\big),$

similar to corresponding results obtained for spectral cut-off in [1], [17], and we omit further details.

5. Adaptation to the smoothness of the alternative

It seems clear from Section 4 that the optimality of the considered tests strongly depends on the regularity (smoothness) of the alternative. In this section, we propose data-driven tests that automatically adapt to the unknown smoothness parameter. The adaptation issue in test theory has widely been investigated. For more details on the subject, we refer for instance to [2], [M] in the direct setting (i.e. $T = \mathrm{Id}$) or [18] in the inverse case for an adaptive scheme based on the singular value decomposition of the operator.

First, we propose a general adaptive scheme. Then, we apply this approach to 
linear regularization over ellipsoids. This methodology can also be extended to 
projection schemes. For the sake of brevity, this extension is not discussed here. 

5.1. A general scheme for adaptation. Assume that we have at our disposal a finite collection $\{R\}_{R \in \mathcal{R}}$ of regularization operators satisfying Assumption A2. Then, we can associate to each operator $R$ a level-$\alpha$ test $\Phi_{\alpha,R}$. Our aim in this section is to construct a test that mimics the behavior of the best possible test among the family $\mathcal{R}$. Let $|\mathcal{R}|$ denote the cardinality of the family $\mathcal{R}$. We define our adaptive test $\Phi^\star_\alpha$ as

(5.1) $\Phi^\star_\alpha = \max_{R \in \mathcal{R}} \Phi_{\alpha/|\mathcal{R}|,R}.$

The performance of $\Phi^\star_\alpha$ is summarized in the following proposition.

Proposition 5.1. The test introduced in (5.1) is a level-$\alpha$ test. Moreover

$P_f(\Phi^\star_\alpha = 0) \le \beta,$

as soon as

$\|f\|^2 \ge 2\inf_{R \in \mathcal{R}}\big(r^2(\Phi_{\alpha/|\mathcal{R}|,R},\beta) + \|f - f_R\|^2\big),$

where the term $r^2$ has been introduced in (2.12).
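Proof. We first remark that

$P_{H_0}(\Phi^\star_\alpha = 1) = P_{H_0}\Big(\max_{R \in \mathcal{R}} \Phi_{\alpha/|\mathcal{R}|,R} = 1\Big) \le \sum_{R \in \mathcal{R}} P_{H_0}(\Phi_{\alpha/|\mathcal{R}|,R} = 1) = \alpha,$

since $P_{H_0}(\Phi_{\alpha/|\mathcal{R}|,R} = 1) = \alpha/|\mathcal{R}|$ for all $R \in \mathcal{R}$. Hence, $\Phi^\star_\alpha$ is a level-$\alpha$ test. Now, we can investigate the second kind error. Using simple algebra, we get that

$P_f(\Phi^\star_\alpha = 0) = P_f\Big(\max_{R \in \mathcal{R}} \Phi_{\alpha/|\mathcal{R}|,R} = 0\Big) = P_f\Big(\bigcap_{R \in \mathcal{R}} \{\Phi_{\alpha/|\mathcal{R}|,R} = 0\}\Big) \le \inf_{R \in \mathcal{R}} P_f(\Phi_{\alpha/|\mathcal{R}|,R} = 0).$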

We can conclude using p.ip . D 
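The level control of the max-type test is a union bound, and it can be illustrated with a small simulation. The sketch below is a toy setting and not the construction of the paper: each individual statistic is an independent standard normal variable under the null (an assumption made here for illustration), each individual test rejects above the $(1 - \alpha/M)$ quantile, and the adaptive test rejects as soon as one of them does. All names are hypothetical.

```python
import random
from statistics import NormalDist

def adaptive_max_test(stats, thresholds):
    """Return 1 (reject) if at least one individual test rejects, else 0."""
    return int(any(s > t for s, t in zip(stats, thresholds)))

def empirical_level(alpha=0.05, M=10, trials=20000, seed=1):
    """Monte Carlo estimate of the rejection probability under the toy null."""
    rng = random.Random(seed)
    # Each of the M tests is run at level alpha / M.
    threshold = NormalDist().inv_cdf(1.0 - alpha / M)
    rejections = 0
    for _ in range(trials):
        stats = [rng.gauss(0.0, 1.0) for _ in range(M)]
        rejections += adaptive_max_test(stats, [threshold] * M)
    return rejections / trials

print(empirical_level())
```

The estimated level stays at or below $\alpha$ up to simulation error, whatever the dependence between the individual statistics would be, since only the union bound is used.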

Proposition 5.1 proves that the detection radius associated to the test defined in (5.1) is close to the smallest possible one among the family $\mathcal{R}$. Thus, we must design the set $\mathcal{R}$ according to two requirements. First, the cardinality $|\mathcal{R}|$ should be small, in order not to enlarge the detection radius too much. Indeed, the following holds true.

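Lemma 5.1. Let $C^*_{\alpha,\beta}$ be the term introduced in (2.11). If the family $\mathcal{R}$ of regularization schemes has cardinality $M := |\mathcal{R}| \ge 1$, then

$C^*_{\alpha/M,\beta} \le C^*_{\alpha,\beta} + 2\sqrt{2\log(M)}.$

If $M \ge 4$ then $C^*_{\alpha/M,\beta} \le (C^*_{\alpha,\beta} + 2\sqrt{2})\sqrt{\log(M)}$.

Proof. We first observe that $x_{\alpha/M} = x_\alpha + \log(M)$. Therefore we conclude that

$C^*_{\alpha/M,\beta} = C^*_{\alpha,\beta} + 2\big(\sqrt{2x_\alpha + 2\log(M)} - \sqrt{2x_\alpha}\big) \le C^*_{\alpha,\beta} + 2\sqrt{2\log(M)}.$

The second assertion is trivial, because $\log(M) \ge 1$ for $M \ge 4$. □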
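The price of splitting the level can be checked numerically. The sketch below assumes $x_\alpha = \log(1/\alpha)$, which is what the identity $x_{\alpha/M} = x_\alpha + \log(M)$ suggests, and it only looks at the $\alpha$-dependent increment $2(\sqrt{2x_{\alpha/M}} - \sqrt{2x_\alpha})$; the function names are hypothetical.

```python
import math

def quantile_term(alpha):
    """Assumed form x_alpha = log(1/alpha), so that x_{alpha/M} = x_alpha + log(M)."""
    return math.log(1.0 / alpha)

def level_split_penalty(alpha, M):
    """Increase 2*(sqrt(2 x_{alpha/M}) - sqrt(2 x_alpha)) caused by testing at level alpha/M."""
    return 2.0 * (math.sqrt(2.0 * quantile_term(alpha / M))
                  - math.sqrt(2.0 * quantile_term(alpha)))

def penalty_bound(M):
    """The bound 2*sqrt(2 log M) on the increase."""
    return 2.0 * math.sqrt(2.0 * math.log(M))

for M in (1, 4, 16, 256):
    print(M, level_split_penalty(0.05, M), penalty_bound(M))
```

The increment is controlled by $2\sqrt{2\log(M)}$ because $\sqrt{a + b} - \sqrt{a} \le \sqrt{b}$ for nonnegative $a$ and $b$, so the cost of adaptation over $M$ schemes grows only like $\sqrt{\log(M)}$.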



Therefore, the price to pay for using $\Phi^\star_\alpha$ is a term of order $\sqrt{\log(|\mathcal{R}|)}$, up to some condition on the behavior of the effective dimension (see Theorem 5.1 below). On the other hand, the set $\mathcal{R}$ should be rich enough to keep the detection radius on the size of the best possible bound, as established in Theorems 3.1 and 3.2.

In the following, we propose practical situations where such an adaptive scheme can be used. In particular, we propose families of regularization operators with controlled size and prove that the adaptive test $\Phi^\star_\alpha$ attains the minimax rate of testing (up to a log log term) for a proper choice of $\mathcal{R}$.

Remark 5.1. In the test (15. ip . each regularization operator R ^ TZ is associated to 
a test ^^^n having the same level a/|7?.|. It is nevertheless possible to use more 
refined approaches, leading to an improvement of the power of the test (in terms of 
the constants). We refer to [ini Eq. (2.2)], however in a slightly different setting. 

5.2. Application to linear regularization. We will exhibit the use of the general methodology for tests based on linear regularization.

Let $g_\tau$ be a given regularization. We associate to each function $g_\tau$ the operator $R_\tau$ and we deal with the family $\mathcal{R} = (R_\tau)_{\tau > 0}$. In order to apply Proposition 5.1 we need to specify a finite subset of $(0, \infty)$ on which the test $\Phi^\star_\alpha$ will be based. To this end we will use an exponential grid. Given an initial value $\tau_{\max}$ and a tuning parameter $0 < q < 1$ we consider the exponential grid

(5.2) $\Lambda_q := \{\tau = q^j\tau_{\max},\ j = 0, \dots, M-1\}$, for some $M > 1$.

Then we use the adaptive test

(5.3) $\Phi^\star_\alpha = \max_{\tau \in \Lambda_q} \Phi_{\alpha/M,\tau}.$
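The exponential grid is straightforward to build. The sketch below constructs $\{q^j\tau_{\max},\ j = 0, \dots, M-1\}$ and, as an illustration, picks the smallest $M$ for which the last grid point does not exceed a prescribed $\tau_{\min}$; this rounding rule and the function names are assumptions made here, not part of the text.

```python
import math

def exponential_grid(tau_max, q, M):
    """Grid {q^j * tau_max, j = 0, ..., M-1} with 0 < q < 1."""
    return [q ** j * tau_max for j in range(M)]

def grid_size(tau_max, tau_min, q):
    """Smallest M such that q^(M-1) * tau_max <= tau_min (assumed rounding rule)."""
    return 1 + math.ceil(math.log(tau_max / tau_min) / math.log(1.0 / q))

M = grid_size(tau_max=1.0, tau_min=1e-3, q=0.5)
grid = exponential_grid(1.0, 0.5, M)
print(M, grid[0], grid[-1])
```

The cardinality grows only logarithmically in $\tau_{\max}/\tau_{\min}$, which is what keeps the level-splitting penalty of the adaptive test small.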

The result from Proposition 5.1 can be rephrased as follows. By virtue of Lemma 5.1 and using the bounds from Lemma 3.1 & Proposition 3.1, respectively, we find that the test $\Phi^\star_\alpha$ bounds the error of the second kind by $\beta$ as soon as

$\|f\|^2 \ge C(\alpha,\beta)\,\inf_{\tau \in \Lambda_q}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau)\Big),$

for some explicit constant $C(\alpha,\beta)$. We shall now show how we can specify the numbers $0 < \tau_{\min} < \tau_{\max}$ such that this is of the order of the separation radius (up to a log log-factor).

The cardinality $M$ obeys $\tau_{\min} := q^{M-1}\tau_{\max}$, and hence $M := \log_{1/q}(\tau_{\max}/\tau_{\min}) + 1$. Obviously we have that

$\inf_{\tau_{\min} \le \tau \le \tau_{\max}}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau)\Big) \le \inf_{\tau \in \Lambda_q}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau)\Big).$

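The reverse is also true (up to some constant), as proved in the following lemma.

Lemma 5.2 (cf. [16, Proof of Thm. 3.1]). We have that

$\inf_{\tau_{\min} \le \tau \le \tau_{\max}}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau)\Big) \ge q^{3/2}\,\inf_{\tau \in \Lambda_q}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau)\Big).$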

Proof. For any r with T,„in < r < Tmax we find an index 1 < j < M for which Tj < 

T < ■^j/9- The crucial observation is that the function r — > — is decreasing, 

whereas the function r -^ ^JtMIj) ~ r'^/^ ^ — — is increasing, which can be seen 
from spectral calculus. Therefore, by using the above monotonicity we see that 



VW) 



Vlog(A/)a2 ^ ^ ' + log(M) — + (^2 (^) 
T r 






-^^^^V[^) [^) ^^^^ + .iog(M)^ + ,^(.,) 

> Vi^i(M)a2 f ^) "'^' r//2 V^M + ,3/2 log(M)^ + ^2(r,) 



^" log(A/)— +^^(r,)), 

To T.- 



'0 



from which the proof can easily be completed. D 
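The monotonicity used in this comparison can be checked numerically. The sketch below assumes the effective dimension in the form $\mathcal{N}(\tau) = \sum_j s_j^2/(s_j^2 + \tau)$ with hypothetical eigenvalues $s_j^2 = j^{-2}$ of $T^*T$. Since $\tau \mapsto \tau\mathcal{N}(\tau)$ is then increasing, one gets $\sqrt{\mathcal{N}(\tau)}/\tau \ge q^{3/2}\sqrt{\mathcal{N}(\tau_j)}/\tau_j$ for $\tau_j \le \tau \le \tau_j/q$, which is where the factor $q^{3/2}$ comes from.

```python
import math

def effective_dimension(tau, s2):
    """N(tau) = sum_j s_j^2 / (s_j^2 + tau) for eigenvalues s2 of T*T."""
    return sum(u / (u + tau) for u in s2)

def variance_term(tau, s2):
    """The quantity sqrt(N(tau)) / tau entering the bound."""
    return math.sqrt(effective_dimension(tau, s2)) / tau

s2 = [j ** -2.0 for j in range(1, 2001)]  # assumed eigenvalues of T*T
q, tau_j = 0.5, 0.01

# Ratios of the variance term between tau in [tau_j, tau_j / q] and the grid point tau_j.
ratios = [variance_term(tau, s2) / variance_term(tau_j, s2)
          for tau in (tau_j, 1.5 * tau_j, tau_j / q)]

# tau -> tau * N(tau) should be increasing along an increasing sequence of tau.
taus = [2.0 ** -k for k in range(15, -1, -1)]
products = [tau * effective_dimension(tau, s2) for tau in taus]
print(ratios, q ** 1.5)
```

The same computation also shows $\mathcal{N}(\|T^*T\|) \ge 1/2$ in this model, because the first summand alone equals $1/2$ at $\tau = s_1^2$.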

We shall next discuss the choices of $\tau_{\min}$ and $\tau_{\max}$. First, the natural domain of definition of the smoothness function $\varphi$ is $(0, \|T^*T\|]$, such that the choice $\tau_{\max} = \|T^*T\|$ is natural. In this case the size of $\sqrt{\log(M)}\,\sigma^2\sqrt{\mathcal{N}(\tau_{\max})}/\tau_{\max} + \log(M)\,\sigma^2/\tau_{\max} + \varphi^2(\tau_{\max})$ is at least $\varphi^2(\|T^*T\|)$ no matter how small the noise level $\sigma$ is. The next result indicates that we can find $\tau_{\min}$ in such a way that we can remove the restriction to $\tau \ge \tau_{\min}$ if there is some 'minimal' smoothness in the alternative.

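Lemma 5.3. Let $\tau_{\min} = \tau_{\min}(M)$ satisfy

(5.4) $\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau_{\min})}}{\tau_{\min}} \ge 1.$

If the smoothness $\varphi$ obeys $\varphi(\tau_{\min}) \le 1$ then for $0 < \tau \le \tau_{\min}$ we have that

$\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau) \ge \frac{1}{2}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau_{\min})}}{\tau_{\min}} + \log(M)\frac{\sigma^2}{\tau_{\min}} + \varphi^2(\tau_{\min})\Big).$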
Proof. For r < Tmin this easily follows from 



ininy 



yj^^^^2 V^M ^ ^2(^^ > VMM)a2vWmiJ ^ ^ 



r r, 



min 



^ -^ I /i — TUT 2 VA/"(rmin) 2/ N 1 

^ \ '^niin / 

which proves the assertion. D 

Remark 5.2. For given $\sigma > 0$ the condition from (5.4) can always be satisfied. Below we shall further specify this as follows. If $\tau_{\max}$ is chosen as $\|T^*T\|$ then $\mathcal{N}(\tau_{\max}) \ge 1/2$, such that



V-^(rmin) ^ v/iog(M)a2 _ yioi(M)a2gi-*'^ ^ 1 



^2 



^mi„ - V2T^i„ V2|1T*T|1 - V2|ir*r||g^^ 

Thus the condition (5.4) holds for

$M \ge \log_{1/q}\big(\sqrt{2}\,\|T^*T\|\big) + \log_{1/q}(1/\sigma^2).$

We summarize the above considerations.

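Proposition 5.2. Suppose that $M$ and $\tau_{\min}$ are chosen such that (5.4) holds. If the smoothness function $\varphi$ obeys $\varphi(\tau_{\min}) \le 1$ then

$\inf_{\tau \in \Lambda_q}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau)\Big) \le 2q^{-3/2}\,\inf_{0 < \tau \le \tau_{\max}}\Big(\sqrt{\log(M)}\,\sigma^2\frac{\sqrt{\mathcal{N}(\tau)}}{\tau} + \log(M)\frac{\sigma^2}{\tau} + \varphi^2(\tau)\Big).$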

The following result summarizes the above considerations; it asserts that the test $\Phi^\star_\alpha$ appears to be minimax (up to a log log term) in many cases.

Theorem 5.1. Let $\alpha$, $\beta$ be fixed and $\Phi^\star_\alpha$ the test defined in (5.3). Suppose that $\tau_{\max} = \|T^*T\|$, and that $\tau_{\min}$ is chosen such that $M \ge \log_{1/q}\big(\sqrt{2}\,\|T^*T\|\big) + \log_{1/q}(1/\sigma^2)$. Let $\tau_*$ be given from



„2,_^_,2./,_,_ ,i,vmn) 



(5.5) ^^(T,) = a^yioglogi/^( — ) 

// the underlying smoothness obeys (/'(''"min) < 1 o,nd if 

tKR\ loglog(J2) ._,. ^^ 

(5.6 — -— -2_=ol as cr^O, 

A/(t*) 

</ien t/iere is a constant C > such that 



P^(*:,/3,f^)<C inf lcT^/loglogi/^(^)^/^^ + (^2(^)V 

In particular, as a \ we have that r* \ 0, and hence that there is a constant D 
D[a, P) such that 



. 4 
We shall indicate that the assumption (5.6) is valid in many cases.
Lemma 5.4. If there is a constant $c > 0$ such that the effective dimension obeys

(5.7) $\mathcal{N}(\tau) \ge c\log(1/\tau),$
and if the smoothness increases at least as

(5.8) ^(r) < (loglogi/,(l/T))' 

as $\tau \to 0$, then (5.6) is valid.

Proof. The parameter r* is determined from (j5.5p . and under (j5.8p we find that 

- loglog,/,(-) = -^^ < j^^ < r, loglog,/,(-), 

provided that r» is small enough. Monotonicity yields that a^ < r*. But then 
loglog(f/(T^) < loglog(f/T*), and we conclude that 

loglog(l/cr^) loglog(f/T^) f loglog(f/T,) ^ 

AA(r,) - AA(n) - c log(l/r,) °^ '' 

as (T, and hence r*, tend to zero. D 

Remark 5.3. This result covers many of the interesting cases, in particular the ones 
from Examples [3] &IH In these cases Theorem 15 . 1 1 exhibits that the separation radii 
obey 

s/(2s+2t+l/2) 



p($:,,.,/3,£^)<i?Lyioglog^J , and 

respectively. In particular we see that adaptation does not pay an additional price 
for severely ill-posed testing problems. 

Remark 5.4. A similar approach can be used when basing the adaptive test on a family of projection schemes. In this case we use a finite family of dimensions

$\Lambda_{2,j_0} := \{m = 2^{j+j_0},\ j = 0, \dots, M-1\},$

and consider projection schemes with spaces $X_m$, $Y_{n(m)}$ for $m \in \Lambda_{2,j_0}$. The above reasoning applies, taking into account the correspondence between the regularization parameter $\tau$ in linear regularization schemes and the dimensions $m \sim 1/\tau$. For the sake of brevity, this will not be discussed in this paper.

Appendix A. Inequalities for Gaussian elements in Hilbert space 

Lemma A.1. Let $X$ be a Gaussian random variable having values in $H$. Then, for all $x > 0$,

$P\Big(\|X\|^2 - E\|X\|^2 > x^2 + 2x\sqrt{E\|X\|^2}\Big) \le \exp\Big(-\frac{x^2}{2v^2}\Big),$

where

$v^2 := \sup_{\|w\| \le 1} E|\langle X, w\rangle|^2.$

Proof. Using the Cauchy-Schwarz inequality, we first observe that

$(E[\|X\|] + x)^2 \le E\|X\|^2 + x^2 + 2x\sqrt{E\|X\|^2}.$

Hence, we get

$P\Big(\|X\|^2 - E\|X\|^2 > x^2 + 2x\sqrt{E\|X\|^2}\Big) \le P\big(\|X\|^2 > (E[\|X\|] + x)^2\big) = P\big(\|X\| > E[\|X\|] + x\big) \le \exp\Big(-\frac{x^2}{2v^2}\Big),$

where for the last inequality we have used [19, Lemma 3.1]. □

Lemma A.2. Let $RY$ be as in (2.1). Then

Pf {\\RYf - EfWRYf < -2VS^) < A 

where S is from \2.10\) . 

Proof. The proof is a direct extension of the one proposed in [18] for a spectral cut-off approach. □